PREFACE


During recent years there has been an ever increasing interest in modern
algebra not only of students in mathematics but also of those in physics,
chemistry, psychology, economics, and statistics. My Modern Higher Algebra
was intended, of course, to serve primarily the first of these groups, and
its rather widespread use has assured me of the propriety of both its
contents and its abstract mode of presentation. This assurance has been
confirmed by its successful use as a text, the sole prerequisite being the subject
matter of L. E. Dickson's First Course in the Theory of Equations. However,
I am fully aware of the serious gap in mode of thought between the intuitive
treatment of algebraic theory of the First Course and the rigorous abstract
treatment of the Modern Higher Algebra, as well as the pedagogical difficulty
which is a consequence.

The publication recently of more abstract presentations of the theory of 
equations gives evidence of attempts to diminish this gap. Another such
attempt has resulted in a supposedly less abstract treatise on modern algebra
which is about to appear as these pages are being written. However, I have 
the feeling that neither of these compromises is desirable and that it would 
be far better to make the transition from the intuitive to the abstract by the 
addition of a new course in algebra to the undergraduate curriculum in 
mathematics—a curriculum which contains at most two courses in algebra 
and these only partly algebraic in content. 

This book is a text for such a course. In fact, its only prerequisite
material is a knowledge of that part of the theory of equations given as a
chapter of the ordinary text in college algebra as well as a reasonably complete
knowledge of the theory of determinants. Thus, it would actually be
possible for a student with adequate mathematical maturity, whose only
training in algebra is a course in college algebra, to grasp the contents. I have used
the text in manuscript form in a class composed of third- and fourth-year 
undergraduate and beginning graduate students, and they all seemed to find 
the material easy to understand. I trust that it will find such use elsewhere 
and that it will serve also to satisfy the great interest in the theory of matrices 
which has been shown me repeatedly by students of the social sciences. 

I wish to express my deep appreciation of the fine critical assistance of 
Dr. Sam Perlis during the course of publication of this book.

A. A. Albert 


University of Chicago
September 9, 1940
CHAPTER I


POLYNOMIALS 

1. Polynomials in x. There are certain simple algebraic concepts with
which the reader is probably well acquainted but not perhaps in the
terminology and form desirable for the study of algebraic theories. We shall thus
begin our exposition with a discussion of these concepts.

We shall speak of the familiar operations of addition, subtraction, and
multiplication as the integral operations. A positive integral power is then
best regarded as the result of a finite repetition of the operation of
multiplication.

A polynomial f(x) in x is any expression obtained as the result of the
application of a finite number of integral operations to x and constants. If
g(x) is a second such expression and it is possible to carry out the operations
indicated in the given formal expressions for f(x) and g(x) so as to obtain
two identical expressions, then we shall regard f(x) and g(x) as being the
same polynomial. This concept is frequently indicated by saying that f(x)
and g(x) are identically equal and by writing f(x) ≡ g(x). However, we
shall usually say merely that f(x) and g(x) are equal polynomials and write
f(x) = g(x). We shall designate by 0 the polynomial which is the constant
zero and shall call this polynomial the zero polynomial. Thus, in a
discussion of polynomials f(x) = 0 will mean that f(x) is the zero polynomial.
No confusion will arise from this usage for it will always be clear from the
context that, in the consideration of a conditional equation f(x) = 0 where
we seek a constant solution c such that f(c) = 0, the polynomial f(x) is not
the zero polynomial. We observe that the zero polynomial has the
properties

0 · g(x) = 0 ,   0 + g(x) = g(x)

for every polynomial g(x).

Our definition of a polynomial includes the use of the familiar term
constant. By this term we shall mean any complex number or function
independent of x. Later on in our algebraic study we shall be much more
explicit about the meaning of this term. For the present, however, we shall
merely make the unprecise assumption that our constants have the usual
properties postulated in elementary algebra. In particular, we shall assume
the properties that if a and b are constants such that ab = 0 then either
a or b is zero; and if a is a nonzero constant then a has a constant inverse
a^{-1} such that a a^{-1} = 1.

If f(x) is the label we assign to a particular formal expression of a
polynomial and we replace x wherever it occurs in f(x) by a constant c, we
obtain a corresponding expression in c which is the constant we designate by
f(c). Suppose now that g(x) is any different formal expression of a
polynomial in x and that f(x) = g(x) in the sense defined above. Then it is
evident that f(c) = g(c). Thus, in particular, if h(x), q(x), r(x) are
polynomials in x such that f(x) = h(x)q(x) + r(x) then f(c) = h(c)q(c) + r(c)
for any c. For example, we have f(x) = x^3 - 2x^2 + 3x, h(x) = x - 1,
q(x) = x^2 - x, r(x) = 2x, and are stating that for any c we have
c^3 - 2c^2 + 3c = (c - 1)(c^2 - c) + 2c.
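The substitution remark can be tried numerically. The following is only a sketch: the coefficient-list representation (highest power first) and the names evaluate and identity_holds are illustrative assumptions, not notation from the text. A check at a few constants illustrates the identity but does not prove it.

```python
# Coefficients are listed from the highest power down, as in a_0 x^n + ... + a_n.
def evaluate(coeffs, c):
    """Replace x by the constant c, accumulating powers by repeated multiplication."""
    value = 0
    for a in coeffs:
        value = value * c + a
    return value

f = [1, -2, 3, 0]   # x^3 - 2x^2 + 3x
h = [1, -1]         # x - 1
q = [1, -1, 0]      # x^2 - x
r = [2, 0]          # 2x

def identity_holds(c):
    """Check f(c) = h(c)q(c) + r(c) for one constant c."""
    return evaluate(f, c) == evaluate(h, c) * evaluate(q, c) + evaluate(r, c)

# A few sample constants, including a complex one.
checks = [identity_holds(c) for c in (-2, 0, 1, 3, 2 + 1j)]
```

Every entry of checks should be True, since the identity holds for every constant c.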

If the indicated integral operations in any given expression of a
polynomial f(x) be carried out, we may express f(x) as a sum of a finite
number of terms of the form a x^k. Here k is a non-negative integer and a is a
constant called the coefficient of x^k. The terms with the same exponent k
may be combined into a single term whose coefficient is the sum of all their
coefficients, and we may then write

(1) f(x) = a_0 x^n + a_1 x^{n-1} + . . . + a_{n-1} x + a_n .

The constants a_i are called the coefficients of f(x) and may be zero, but
unless f(x) is the zero polynomial, we may always take a_0 ≠ 0. The
expression (1) of f(x) with a_0 ≠ 0 is most important since, if g(x) is a second
polynomial and we write g(x) in the corresponding form

(2) g(x) = b_0 x^m + b_1 x^{m-1} + . . . + b_m

with b_0 ≠ 0, then f(x) and g(x) are equal if and only if m = n, a_i = b_i for
i = 0, . . . , n. In other words, we may say that the expression (1) of a
polynomial is unique, that is, two polynomials are equal if and only if their
expressions (1) are identical.

The integer n of any expression (1) of f(x) is called the virtual degree of
the expression (1). If a_0 ≠ 0 we call n the degree* of f(x). Thus, either any
f(x) has a positive integral degree, or f(x) = a_n is a constant and will be
called a constant polynomial in x. If, then, a_n ≠ 0 we say that the
constant polynomial f(x) has degree zero. But if a_n = 0, so that f(x) is the
zero polynomial, we shall assign to it the degree minus infinity. This will
be done so as to imply that certain simple theorems on polynomials shall
hold without exception.

* Clearly any polynomial of degree n_0 may be written as an expression of the form (1)
of virtual degree any integer n ≥ n_0. We may thus speak of any such n as a virtual
degree of f(x).

The coefficient a_0 in (1) will be called the virtual leading coefficient of this
expression of f(x) and will be called the leading coefficient of f(x) if and only
if it is not zero. We shall call f(x) a monic polynomial if a_0 = 1. We then
have the elementary results referred to above, whose almost trivial
verification we leave to the reader.

Lemma 1. The degree of a product of two polynomials f(x) and g(x) is the
sum of the degrees of f(x) and g(x). The leading coefficient of f(x) · g(x) is
the product of the leading coefficients of f(x) and g(x), and thus, if f(x) and
g(x) are monic, so is f(x) · g(x).

Lemma 2. A product of two nonzero polynomials is nonzero and is a
constant if and only if both factors are constants.

Lemma 3. Let f(x) be nonzero and such that f(x)g(x) = f(x)h(x). Then
g(x) = h(x).

Lemma 4. The degree of f(x) + g(x) is at most the larger of the two degrees
of f(x) and g(x).
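Lemmas 1 and 4 can be checked on small examples. This is a minimal sketch under assumed conventions: a polynomial is a list of integer coefficients from the highest power down, the zero polynomial is the empty list, and the helper names trim, degree, multiply, and add are illustrative only. Python's float("-inf") stands in for the degree minus infinity, so the product rule also holds when one factor is zero.

```python
def trim(p):
    """Drop zero virtual leading coefficients; the zero polynomial becomes []."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return list(p[i:])

def degree(p):
    """Degree of p, with minus infinity assigned to the zero polynomial."""
    p = trim(p)
    return len(p) - 1 if p else float("-inf")

def multiply(f, g):
    """Product of two polynomials given as coefficient lists."""
    f, g = trim(f), trim(g)
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def add(f, g):
    """Sum of two polynomials; leading terms may cancel, so the result is trimmed."""
    n = max(len(f), len(g))
    f = [0] * (n - len(f)) + list(f)
    g = [0] * (n - len(g)) + list(g)
    return trim([a + b for a, b in zip(f, g)])

f = [2, 0, 1]              # 2x^2 + 1
g = [3, -1]                # 3x - 1
product = multiply(f, g)   # 6x^3 - 2x^2 + 3x - 1
```

Here degree(product) equals degree(f) + degree(g), and the leading coefficient 6 is the product 2 · 3 of the leading coefficients.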

EXERCISES*

* The early exercises in our sets should normally be taken up orally. The author's
choice of oral exercises will be indicated by the language employed.

1. State the condition that the degree of f(x) + g(x) be less than the degree of
either f(x) or g(x).

2. What can one say about the degree of f(x) + g(x) if f(x) and g(x) have
positive leading coefficients?

3. What can one say about the degree of f^2, of f^3, of f^k for f = f(x) a polynomial,
k a positive integer?

4. State a result about the degree and leading coefficient of any polynomial
s(x) = f_1^2 + . . . + f_t^2 for t > 1, f_i = f_i(x) a polynomial in x with real coefficients.

5. Make a corresponding statement about g(x)s(x) where g(x) has odd degree
and real coefficients, s(x) as in Ex. 4.

6. State the relation between the term of least degree in f(x)g(x) and those of
least degree in f(x) and g(x).

7. State why it is true that if x is not a factor of f(x) or g(x) then x is not a
factor of f(x)g(x).

8. Use Ex. 7 to prove that if k is a positive integer then x is a factor of [f(x)]^k if
and only if x is a factor of f(x).

9. Let f and g be polynomials in x such that the following equations are satisfied
(identically). Show, then, that both f and g are zero. Hint: Verify first that
otherwise both f and g are not zero. Express each equation in the form a(x) = b(x) and
apply Ex. 3. In parts (c) and (d) complete the squares.

a) f^2 + x g^2 = 0             c) f^2 + 2x f g + ([illegible]) g^2 = 0

b) f^2 - [illegible] = 0       d) f^2 + 2x f g - x g^2 = 0

10. Use Ex. 8 to give another proof of (a), (b), and (d) of Ex. 9. Hint: Show that
if f and g are nonzero polynomial solutions of these equations of least possible
degrees, then x divides f = x f_1 as well as g = x g_1. But then f_1 and g_1 are also solutions,
a contradiction.

11. Use Ex. 4 to show that if f, g, and h are polynomials in x with real coefficients
satisfying the following equations (identically), then they are all zero:

a) f^2 - x g^2 = x h^2

b) f^2 - x g^2 + h^2 = 0

c) f^2 + g^2 + (x + [illegible]) h^2 = 0

12. Find solutions of the equations of Ex. 11 for polynomials f, g, h with complex
coefficients and not all zero.

2. The division algorithm. The result of the application of the process
ordinarily called long division to polynomials is a theorem which we shall
call the Division Algorithm for polynomials and shall state as

Theorem 1. Let f(x) and g(x) be polynomials of respective degrees n and m,
g(x) ≠ 0. Then there exist unique polynomials q(x) and r(x) such that r(x)
has virtual degree m - 1, q(x) is either zero or has degree n - m, and

(3) f(x) = q(x)g(x) + r(x) .

For let f(x) and g(x) be defined respectively by (1) and (2) with b_0 ≠ 0.
Then, either n < m and we have (3) with q(x) = 0, r(x) = f(x), or a_0 ≠ 0,
n ≥ m. If c_k is the virtual leading coefficient of a polynomial h(x) of virtual
degree m + k ≥ m, a virtual degree of h(x) - b_0^{-1} c_k x^k g(x) is m + k - 1.
Thus a virtual degree of f(x) - b_0^{-1} a_0 x^{n-m} g(x) is n - 1, and a finite repetition
of this process yields a polynomial r(x) = f(x) - b_0^{-1}(a_0 x^{n-m} + . . .)g(x) of
virtual degree m - 1, and hence (3) for q(x) of degree n - m and leading
coefficient a_0 b_0^{-1} ≠ 0. If also f(x) = q_0(x)g(x) + r_0(x) for r_0(x) of virtual
degree m - 1, then a virtual degree of s(x) = r_0(x) - r(x) is m - 1. But
Lemma 1 states that if t(x) = q(x) - q_0(x) ≠ 0 the degree of s(x) =
t(x)g(x) is the sum of m and the degree of t(x). This is impossible; and
t(x) = 0, q(x) = q_0(x), r(x) = r_0(x).
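The reduction step of this proof, subtracting b_0^{-1} c_k x^k g(x) to lower the virtual degree by one, can be sketched as a short routine. The sketch assumes rational coefficients (via Python's fractions module) and coefficient lists running from the highest power down; the name divide and the list representation are illustrative choices, not part of the text.

```python
from fractions import Fraction

def divide(f, g):
    """Return (q, r) with f = q*g + r and r of virtual degree len(g) - 2.

    Coefficient lists run from the highest power down; g[0] must be nonzero.
    """
    if not g or g[0] == 0:
        raise ValueError("g(x) must have a nonzero leading coefficient")
    r = [Fraction(a) for a in f]
    q = []
    m = len(g) - 1
    while len(r) - 1 >= m:
        c = r[0] / g[0]              # b_0^{-1} times the virtual leading coefficient
        q.append(c)
        for j, b in enumerate(g):    # subtract c * x^k * g(x)
            r[j] -= c * b
        r.pop(0)                     # the leading term is now zero: virtual degree drops by one
    return q, r

# f(x) = x^3 - 2x^2 + 3x divided by g(x) = x - 1
q, r = divide([1, -2, 3, 0], [1, -1])
```

For this input the quotient is x^2 - x + 2 and the remainder is the constant 2, so that x^3 - 2x^2 + 3x = (x^2 - x + 2)(x - 1) + 2. When n < m the loop never runs and the routine returns q = 0, r = f, as in the first case of the proof.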

The Remainder Theorem of Algebra states that if we use the Division
Algorithm to write

f(x) = q(x)(x - c) + r(x),

so that g(x) = x - c has degree one and r = r(x) is necessarily a constant,
then r = f(c). The obvious proof of this result is the use of the remark in
the fifth paragraph of Section 1 to obtain f(c) = q(c)(c - c) + r, f(c) = r
as desired. It is for this application that we made the remark.
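For a divisor x - c the Division Algorithm collapses to the familiar synthetic division scheme, and the last number produced is the remainder r = f(c). The following sketch assumes coefficient lists from the highest power down; the function name divide_by_linear is a hypothetical label.

```python
def divide_by_linear(f, c):
    """Synthetic division of f(x) by x - c; returns (quotient coefficients, remainder)."""
    if not f:
        return [], 0
    row = []
    carry = 0
    for a in f:
        carry = carry * c + a   # same recurrence as evaluating f at c
        row.append(carry)
    return row[:-1], row[-1]

# f(x) = x^3 - 2x^2 + 3x divided by x - 2
quotient, remainder = divide_by_linear([1, -2, 3, 0], 2)

# x^2 - 3x + 2 divided by x - 1: remainder zero, so x - 1 is a factor
factor_quotient, factor_remainder = divide_by_linear([1, -3, 2], 1)
```

In the first case the quotient is x^2 + 3 and the remainder is 6 = f(2). In the second the remainder is 0, which is the situation described by the Factor Theorem.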

The Division Algorithm and Remainder Theorem imply the Factor Theorem,
a result obtained and used frequently in the study of polynomial
equations. We shall leave the statements of that theorem, and the subsequent
definitions and theorems on the roots and corresponding factorizations of
polynomials* with real or complex coefficients, to the reader.

* If f(x) is a polynomial in x and c is a constant such that f(c) = 0 then we shall
call c a root not only of the equation f(x) = 0 but also of the polynomial f(x).
EXERCISES

1. Show by formal differentiation that if c is a root of multiplicity m of f(x) =
(x - c)^m q(x) then c is a root of multiplicity m - 1 of the derivative f'(x) of f(x).
What then is a necessary and sufficient condition that f(x) have multiple roots?

2. Let c be a root of a polynomial f(x) of degree n and ordinary integral
coefficients. Use the Division Algorithm to show that any polynomial h(c) with rational
coefficients may be expressed in the form b_0 + b_1 c + . . . + b_{n-1} c^{n-1} for rational
numbers b_0, . . . , b_{n-1}. Hint: Write h(x) = q(x)f(x) + r(x) and replace x by c.

3. Let f(x) = x^[illegible] + 3x^2 + 4 in Ex. 2. Compute the corresponding b_i for each of
the polynomials

a) c^6 + 10c^4 + 25c^2               c) c^[illegible] - 2c^[illegible] + [illegible]

b) c^4 + 4c^3 + 6c^2 + 4c + 1        d) (2c^[illegible] + 3)(c^[illegible] + 3c)

3. Polynomial divisibility. Let f(x) and g(x) ≠ 0 be polynomials. Then
by the statement that g(x) divides f(x) we mean that there exists a
polynomial q(x) such that f(x) = q(x)g(x). Thus, g(x) ≠ 0 divides f(x) if and
only if the polynomial r(x) of (3) is the zero polynomial, and we shall say
in this case that f(x) has g(x) as a factor, g(x) is a factor of f(x).

We shall call two nonzero polynomials f(x) and g(x) associated polynomials
if f(x) divides g(x) and g(x) divides f(x). Then f(x) = q(x)g(x), g(x) =
h(x)f(x), so that f(x) = q(x)h(x)f(x). Applying Lemmas 3 and 2, we have
q(x)h(x) = 1, q(x) and h(x) are nonzero constants. Thus f(x) and g(x) are
associated if and only if each is a nonzero constant multiple of the other.

It is clear that every nonzero polynomial is associated with a monic
polynomial. Observe thus that the familiar process of dividing out the leading
coefficient in a conditional equation f(x) = 0 is that used to replace this
equation by the equation g(x) = 0 where g(x) is the monic polynomial
associated with f(x).

Two associated monic polynomials are equal. We see from this that if
g(x) divides f(x) every polynomial associated with g(x) divides f(x) and
that one possible way to distinguish a member of the set of all associates
of g(x) is to assume the associate to be monic. We shall use this property
later when we discuss the existence of a unique greatest common divisor
(abbreviated, g.c.d.) of polynomials in x.

In our discussion of the g.c.d. of polynomials we shall obtain a property
which may best be described in terms of the concept of rational function.
It will thus be desirable to arrange our exposition so as to precede the study
of greatest common divisors by a discussion of the elements of the theory of
polynomials and rational functions of several variables, and we shall do so.

EXERCISES

1. Let f = f(x) be a polynomial in x and define m(f) = x^m f(1/x) for every
positive integer m. Show that m(f) is a polynomial in x of virtual degree m if and only
if m is a virtual degree of f(x).

2. Show that m(0) = 0, m[m(f)] = f.

3. Define f̂ = 0 if f = 0, and f̂ = n(f) if f is any nonzero polynomial of degree n.
Show that m(f) = x^{m-n} f̂ for every m > n and that, if f ≠ 0, x is not a factor of f̂.

4. Let g be a factor of f. Prove that ĝ is a factor of m(f) for every m which is at
least the degree of f.

4. Polynomials in several variables. Some of our results on polynomials
in x may be extended easily to polynomials in several variables. We define
a polynomial f = f(x_1, . . . , x_q) in x_1, . . . , x_q to be any expression obtained
as the result of a finite number of integral operations on x_1, . . . , x_q and
constants. As in Section 1 we may express f(x_1, . . . , x_q) as the sum of a
finite number of terms of the form

(4) a x_1^{k_1} x_2^{k_2} . . . x_q^{k_q} .

We call a the coefficient of the term (4) and define the virtual degree in
x_1, . . . , x_q of such a term to be k_1 + . . . + k_q, the virtual degree of a
particular expression of f as a sum of terms of the form (4) to be the largest of
the virtual degrees of its terms (4). If two terms of f have the same set of
exponents k_1, . . . , k_q, we may combine them by adding their coefficients
and thus write f as the unique sum, that is, the sum with unique coefficients,

(5) f = f(x_1, . . . , x_q) = Σ a_{k_1 . . . k_q} x_1^{k_1} . . . x_q^{k_q}   (k_j = 0, 1, . . . , n_j; j = 1, . . . , q) .

Here the coefficients a_{k_1 . . . k_q} are constants and n_j is the degree of
f(x_1, . . . , x_q) considered as a polynomial in x_j alone. Also f is the zero
polynomial if and only if all its coefficients are zero. If f is a nonzero
polynomial, then some a_{k_1 . . . k_q} ≠ 0, and the degree of f is defined to be the
maximum sum k_1 + . . . + k_q for a_{k_1 . . . k_q} ≠ 0. As before we assign the
degree minus infinity to the zero polynomial and have the property that
nonzero constant polynomials have degree zero. Note now that a
polynomial may have several different terms of the same degree and that
consequently the usual definitions of leading term and coefficient do not apply.
However, some of the most important simple properties of polynomials in
x hold also for polynomials in several x_i, and we shall proceed to their
derivation.

We observe that a polynomial f in x_1, . . . , x_q may be regarded as a
polynomial (1) of degree n = n_q in x = x_q with its coefficients a_0, . . . , a_n all
polynomials in x_1, . . . , x_{q-1} and a_0 not zero. If, similarly, g be given by
(2) with b_0 not zero, then a virtual degree in x_q of fg is m + n, and a virtual
leading coefficient of fg is a_0 b_0. If q = 2, then a_0 and b_0 are nonzero
polynomials in x_1 and a_0 b_0 ≠ 0 by Lemma 2. Then we have proved that the
product fg of two nonzero polynomials f and g in x_1, x_2 is not zero. If we
prove similarly that the product of two nonzero polynomials in x_1, . . . , x_{q-1}
is not zero, we apply the proof above to obtain a_0 b_0 ≠ 0 and hence have
proved that the product fg of two nonzero polynomials in x_1, . . . , x_q is not
zero. We have thus completed the proof of

Theorem 2. The product of any two nonzero polynomials in x_1, . . . , x_q
is not zero.

We have the immediate consequence

Theorem 3. Let f, g, h be polynomials in x_1, . . . , x_q and f be nonzero,
fg = fh. Then g = h.

To continue our discussion we shall need to consider an important special
type of polynomial. Thus we shall call f(x_1, . . . , x_q) a homogeneous
polynomial or a form in x_1, . . . , x_q if all terms of (5) have the same degree
k = k_1 + . . . + k_q. Then, if f is given by (5) and we replace x_i in (5) by
y x_i, we see that each power product x_1^{k_1} . . . x_q^{k_q} is replaced by
y^{k_1 + . . . + k_q} x_1^{k_1} . . . x_q^{k_q} and thus that the polynomial
f(y x_1, . . . , y x_q) = y^k f(x_1, . . . , x_q) identically in y, x_1, . . . , x_q
if and only if f(x_1, . . . , x_q) is a form of degree k in x_1, . . . , x_q.
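The scaling property of forms can be illustrated with a small sketch. The representation is an assumption made for the example: a polynomial in x_1, . . . , x_q is a dictionary from exponent tuples (k_1, . . . , k_q) to coefficients, and the names evaluate and is_form are hypothetical. A check at one point illustrates the identity; it is not a proof that it holds identically.

```python
def evaluate(poly, point):
    """Value of a polynomial stored as {(k_1, ..., k_q): coefficient} at a point."""
    total = 0
    for exponents, a in poly.items():
        term = a
        for x, k in zip(point, exponents):
            term *= x ** k
        total += term
    return total

def is_form(poly):
    """True when all terms with nonzero coefficient have the same degree k_1 + ... + k_q."""
    degrees = {sum(e) for e, a in poly.items() if a != 0}
    return len(degrees) <= 1

form = {(2, 0): 1, (1, 1): -3, (0, 2): 5}   # x1^2 - 3 x1 x2 + 5 x2^2, a form of degree 2
not_form = {(2, 0): 1, (0, 1): 1}           # x1^2 + x2, terms of degrees 2 and 1

y, point = 7, (2, 3)
scaled_point = tuple(y * x for x in point)

lhs = evaluate(form, scaled_point)   # f(y x1, y x2)
rhs = y ** 2 * evaluate(form, point) # y^2 f(x1, x2)
```

For the quadratic form the two sides agree, while for x1^2 + x2 no single power of y works, since its two terms scale by different powers of y.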

The product of two forms f and g of respective degrees n and m in the
same x_1, . . . , x_q is clearly a form of degree m + n and, by Theorem 2, is
nonzero if and only if f and g are nonzero. We now use this result to obtain
the second of the properties we desire. It is a generalization of Lemma 1.

Observe first that all the terms of the same degree in a nonzero
polynomial (5) may be grouped together into a form of this degree and then
we may express (5) uniquely as the sum

(6) f = f(x_1, . . . , x_q) = f_0 + . . . + f_n ,

where f_0 is a nonzero form of the same degree n as the polynomial f and f_i is
a form of degree n - i. If also

(7) g = g(x_1, . . . , x_q) = g_0 + . . . + g_m

for forms g_i of degree m - i and such that g_0 ≠ 0, then clearly

(8) fg = h_0 + . . . + h_{m+n} ,

where the h_i are forms of degree m + n - i and h_0 = f_0 g_0. By Theorem 2
h_0 ≠ 0. Thus if we call f_0 the leading form of f, we clearly have

Theorem 4. Let f and g be polynomials in x_1, . . . , x_q. Then the degree
of fg is the sum of the degrees of f and g and the leading form of fg is the
product of the leading forms of f and g.
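Theorem 4 can be checked on a small example in two variables. As before this is only a sketch: polynomials are assumed to be stored as dictionaries from exponent tuples to integer coefficients, and the helper names multiply, degree, and leading_form are illustrative. The helpers assume nonzero input polynomials.

```python
def multiply(f, g):
    """Product of two polynomials stored as {(k_1, ..., k_q): coefficient}."""
    out = {}
    for ef, a in f.items():
        for eg, b in g.items():
            e = tuple(i + j for i, j in zip(ef, eg))
            out[e] = out.get(e, 0) + a * b
    return {e: c for e, c in out.items() if c != 0}   # drop cancelled terms

def degree(f):
    """Largest k_1 + ... + k_q over the terms of a nonzero polynomial f."""
    return max(sum(e) for e in f)

def leading_form(f):
    """The form f_0 made up of all terms of f of highest degree."""
    n = degree(f)
    return {e: c for e, c in f.items() if sum(e) == n}

f = {(2, 0): 1, (1, 1): 1, (0, 1): 1}    # x1^2 + x1 x2 + x2
g = {(1, 0): 1, (0, 1): -1, (0, 0): 1}   # x1 - x2 + 1

fg = multiply(f, g)
h0 = leading_form(fg)                                  # leading form of fg
f0_g0 = multiply(leading_form(f), leading_form(g))     # product of the leading forms
```

Here the leading forms are x1^2 + x1 x2 and x1 - x2, their product is x1^3 - x1 x2^2, and this is also the leading form of fg, whose degree is 3 = 2 + 1.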

The result above is evidently fundamental for the study of polynomials 
in several variables—a study which we shall discuss only briefly in these 
pages. 


5. Rational functions. The integral operations together with the
operation of division by a nonzero quantity form a set of what are called the
rational operations. A rational function of x_1, . . . , x_q is now defined to be
any function obtained as the result of a finite number of rational operations
on x_1, . . . , x_q and constants. The postulates of elementary algebra were
seen by the reader in his earliest algebraic study to imply that every rational
function of x_1, . . . , x_q may be expressed as a quotient

(9) f = a(x_1, . . . , x_q) / b(x_1, . . . , x_q)

for polynomials a(x_1, . . . , x_q) and b(x_1, . . . , x_q) ≠ 0. The coefficients of
a(x_1, . . . , x_q) and b(x_1, . . . , x_q) are then called coefficients of f. Let us
observe then that the set of all rational functions in x_1, . . . , x_q with complex
coefficients has a property which we describe by saying that the set is
closed with respect to rational operations. By this we mean that every rational
function of the elements in this set is in the set. This may be seen to be
due to the definitions a/b + c/d = (ad + bc)/bd, (a/b) · (c/d) = (ac)/(bd).
Here b and d are necessarily not zero, and we may use Theorem 2 to obtain
bd ≠ 0. Observe, then, that the set of rational functions satisfies the
properties we assumed in Section 1 for our constants, that is, fg = 0 if and only
if f = 0 or g = 0, while if f ≠ 0 then f^{-1} exists such that f f^{-1} = 1.

6. A greatest common divisor process. The existence of a g.c.d. of two
polynomials and the method of its computation are essential in the study
of what are called Sturm's functions and so are well known to the reader
who has studied the Theory of Equations. We shall repeat this material here
because of its importance for algebraic theories.

We define the g.c.d. of polynomials f_1(x), . . . , f_t(x) not all zero to be any
monic polynomial d(x) which divides all the f_i(x), and is such that if g(x)
divides every f_i(x) then g(x) divides d(x). If d_0(x) is a second such polynomial,
then d(x) and d_0(x) divide each other, d(x) and d_0(x) are associated monic
polynomials and are equal. Hence, according to our definition, the g.c.d.
of f_1(x), . . . , f_t(x) is a unique polynomial.

If g(x) divides all the f_i(x), then g(x) divides d(x), and hence the degree
of d(x) is at least that of g(x). Thus the g.c.d. d(x) is a common divisor
of the f_i(x) of largest possible degree and is clearly the unique monic
common divisor of this degree.

If d_j(x) is the g.c.d. of f_1(x), . . . , f_j(x) and d_0(x) is the g.c.d. of d_j(x) and
f_{j+1}(x), then d_0(x) is the g.c.d. of f_1(x), . . . , f_{j+1}(x). For every common
divisor h(x) of f_1(x), . . . , f_{j+1}(x) divides f_1(x), . . . , f_j(x), and hence both
d_j(x) and f_{j+1}(x), h(x) divides d_0(x). Moreover, d_0(x) divides f_{j+1}(x) and
the divisor d_j(x) of f_1(x), . . . , f_j(x), d_0(x) divides f_1(x), . . . , f_{j+1}(x).

The result above evidently reduces the problems of the existence and 
construction of a g.c.d. of any number of polynomials in x not all zero to 
the case of two nonzero polynomials. We shall now study this latter
problem and state the result we shall prove as

Theorem 5. Let f(x) and g(x) be polynomials not both zero. Then there
exist polynomials a(x) and b(x) such that

(10) d(x) = a(x)f(x) + b(x)g(x)

is a monic common divisor of f(x) and g(x). Moreover, d(x) is then the unique
g.c.d. of f(x) and g(x).

For if f(x) = 0, then d(x) is associated with g(x), a(x) = 1 and b(x) = b_0^{-1}
is a solution of (10) if g(x) is given by (2). Hence, there is no loss of generality
if we assume that both f(x) and g(x) are nonzero and that the degree of
g(x) is not greater than the degree of f(x). For consistency of notation
we put

(11) h_0(x) = f(x) ,   h_1(x) = g(x) .
By Theorem 1

(12) h_0(x) = q_1(x)h_1(x) + h_2(x) ,

where the degree of h_2(x) is less than the degree of h_1(x). If h_2(x) ≠ 0, we
may apply Theorem 1 to obtain

(13) h_1(x) = q_2(x)h_2(x) + h_3(x) ,

where the degree of h_3(x) is less than that of h_2(x). Thus our division process
yields a sequence of equations of the form

(14) h_{i-2}(x) = q_{i-1}(x)h_{i-1}(x) + h_i(x) ,

where if n_i is the degree of h_i(x) then n_1 > n_2 > . . . , while n_i ≥ 0 unless
h_i(x) = 0. We conclude that our sequence must terminate with

(15) h_{r-2}(x) = q_{r-1}(x)h_{r-1}(x) + h_r(x)

and

(16) h_r(x) ≠ 0 ,   h_{r-1}(x) = q_r(x)h_r(x)

for r ≥ 1.

Equation (16) implies that (15) may be replaced by h_{r-2}(x) =
[q_{r-1}(x)q_r(x) + 1]h_r(x). Thus h_r(x) divides both h_{r-2}(x) and h_{r-1}(x). If we
assume that h_r(x) divides h_i(x) and h_{i-1}(x), then (14) implies that h_r(x)
divides h_{i-2}(x). An evident proof by induction shows that h_r(x) divides
both h_0(x) = f(x) and h_1(x) = g(x).

Equation (12) implies that h_2(x) = a_2(x)f(x) + b_2(x)g(x) with a_2(x) = 1,
b_2(x) = -q_1(x). Clearly also h_1(x) = a_1(x)f(x) + b_1(x)g(x) with a_1(x) = 0,
b_1(x) = 1. If, now, h_{i-2}(x) = a_{i-2}(x)f(x) + b_{i-2}(x)g(x) and h_{i-1}(x) =
a_{i-1}(x)f(x) + b_{i-1}(x)g(x) then (14) implies that h_i(x) = [a_{i-2}(x) -
q_{i-1}(x)a_{i-1}(x)]f(x) + [b_{i-2}(x) - q_{i-1}(x)b_{i-1}(x)]g(x) = a_i(x)f(x) + b_i(x)g(x).
Thus we obtain h_r(x) = a_r(x)f(x) + b_r(x)g(x). The polynomial h_r(x) is a
common divisor of f(x) and g(x) and is associated with a monic common
divisor d(x) = c h_r(x). Then d(x) has the form (10) for a(x) = c a_r(x), b(x) =
c b_r(x). We have already shown that d(x) is unique.
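The chain of divisions (12) to (16), together with the recursion for a_i(x) and b_i(x), can be sketched as follows. The sketch assumes rational coefficients and coefficient lists from the highest power down, with the zero polynomial as the empty list; the names trim, add, mul, divmod_poly, and euclid are illustrative. It starts the recursion one step earlier than the text, from h_0 = 1·f + 0·g, which gives the same a_2 = 1 and b_2 = -q_1.

```python
from fractions import Fraction

def trim(p):
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return list(p[i:])

def add(f, g):
    n = max(len(f), len(g))
    f = [0] * (n - len(f)) + list(f)
    g = [0] * (n - len(g)) + list(g)
    return trim([a + b for a, b in zip(f, g)])

def mul(f, g):
    if not f or not g:
        return []
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(f, g):
    """Division Algorithm: f = q*g + r, with g nonzero and trimmed."""
    r, q = list(f), []
    while len(r) >= len(g):
        c = r[0] / g[0]
        q.append(c)
        for j, b in enumerate(g):
            r[j] -= c * b
        r.pop(0)
    return q, trim(r)

def euclid(f, g):
    """Return (d, a, b) with d = a*f + b*g the monic g.c.d.; f, g not both zero."""
    h_prev = trim([Fraction(x) for x in f])   # h_0 = f
    h = trim([Fraction(x) for x in g])        # h_1 = g
    a_prev, a = [Fraction(1)], []             # h_0 = 1*f + 0*g
    b_prev, b = [], [Fraction(1)]             # h_1 = 0*f + 1*g
    while h:
        q, rem = divmod_poly(h_prev, h)       # equation (14)
        neg_q = [-c for c in q]
        h_prev, h = h, rem
        a_prev, a = a, add(a_prev, mul(neg_q, a))
        b_prev, b = b, add(b_prev, mul(neg_q, b))
    c = 1 / h_prev[0]                         # make the last nonzero remainder monic
    return ([c * x for x in h_prev], [c * x for x in a_prev], [c * x for x in b_prev])

f = [1, 4, 1, -6]    # x^3 + 4x^2 + x - 6 = (x - 1)(x + 2)(x + 3)
g = [1, -3, 2]       # x^2 - 3x + 2 = (x - 1)(x - 2)
d, a, b = euclid(f, g)
```

For these inputs the process stops after two divisions with d(x) = x - 1, a(x) = 1/20, and b(x) = -(x + 7)/20, and one may verify that a(x)f(x) + b(x)g(x) = x - 1.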

The process used above was first discovered by Euclid, who utilized it in
his geometric formulation of the analogous result on the g.c.d. of integers.
It is therefore usually called Euclid's process. We observe that it not only
enables us to prove the existence of d(x) but gives us a finite process by
means of which d(x) may be computed. Notice finally that d(x) is computed
by a repetition of the Division Algorithm on f(x), g(x) and polynomials
secured from f(x) and g(x) as remainders in the application of the Division
Algorithm. But this implies the result we state as

Theorem 6. The polynomials a(x), b(x), and hence the greatest common
divisor d(x) of Theorem 5 all have coefficients which are rational functions
with rational number coefficients of the coefficients of f(x) and g(x).

We thus have the 

Corollary. Let the coefficients of f(x) and g(x) be rational numbers. Then 
the coefficients of their g.c.d. are rational numbers. 

If the only common divisors of f(x) and g(x) are constants, then d(x) = 1
and we shall call f(x) and g(x) relatively prime polynomials. We shall also
indicate this at times by saying that f(x) is prime to g(x) and hence also
that g(x) is prime to f(x). When f(x) and g(x) are relatively prime, we use
(10) to obtain polynomials a(x) and b(x) such that

(17) a(x)f(x) + b(x)g(x) = 1 .

It is interesting to observe that the polynomials a(x) and b(x) in (17) are
not unique and that it is possible to define a certain unique pair and then
determine all others in terms of this pair. To do this we first prove the

Lemma 5. Let f(x), g(x), and h(x) be nonzero polynomials such that f(x)
is prime to g(x) and divides g(x)h(x). Then f(x) divides h(x).

For we may write g(x)h(x) = f(x)q(x) and use (17) to obtain [a(x)f(x) +
b(x)g(x)]h(x) = [a(x)h(x) + b(x)q(x)]f(x) = h(x) as desired.

We now obtain 

Theorem 7. Let f(x) of degree n and g(x) of degree m be relatively prime.
Then there exist unique polynomials a_0(x) of degree at most m - 1 and b_0(x)
of degree at most n - 1 such that a_0(x)f(x) + b_0(x)g(x) = 1. Every pair of
polynomials a(x) and b(x) satisfying (17) has the form

(18) a(x) = a_0(x) + c(x)g(x) ,   b(x) = b_0(x) - c(x)f(x)

for a polynomial c(x).

For, if a(x) is any solution of (17), we apply Theorem 1 to obtain the
first equation of (18) with a_0(x) the remainder on division of a(x) by g(x).
Then a_0(x) has degree at most m - 1, a(x)f(x) + b(x)g(x) = a_0(x)f(x) +
[b(x) + c(x)f(x)]g(x) = 1. We define b_0(x) = b(x) + c(x)f(x) and see that
b_0(x)g(x) = -a_0(x)f(x) + 1 has degree at most m + n - 1. By Lemma 1
the degree of b_0(x) is at most n - 1, a_0(x)f(x) + b_0(x)g(x) = 1 as desired.
If now a_1(x) has virtual degree m - 1, b_1(x) virtual degree n - 1 and
a_1(x)f(x) + b_1(x)g(x) = a_0(x)f(x) + b_0(x)g(x), then f(x) clearly divides
[b_0(x) - b_1(x)]g(x). By Lemma 5 the polynomial b_0(x) - b_1(x) of virtual
degree n - 1 is divisible by f(x) of degree n and must be zero. Hence,
b_0(x) = b_1(x), so that a_1(x)f(x) = a_0(x)f(x), a_1(x) = a_0(x). This proves a_0(x)
and b_0(x) unique. But the definition above of a_0(x) as the remainder on
division of a(x) by g(x) shows that then (18) holds.
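The construction in this proof, taking a_0(x) as the remainder of a(x) on division by g(x) and setting b_0(x) = b(x) + c(x)f(x), can be sketched in a few lines. The conventions are assumptions made for the example: rational coefficients, coefficient lists from the highest power down, and the hypothetical helper names trim, add, mul, divide, and reduce_pair.

```python
from fractions import Fraction

def trim(p):
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return list(p[i:])

def add(f, g):
    n = max(len(f), len(g))
    f = [0] * (n - len(f)) + list(f)
    g = [0] * (n - len(g)) + list(g)
    return trim([a + b for a, b in zip(f, g)])

def mul(f, g):
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out)

def divide(f, g):
    """Division Algorithm: return (q, r) with f = q*g + r; g trimmed and nonzero."""
    r, q = [Fraction(a) for a in f], []
    while len(r) >= len(g):
        c = r[0] / g[0]
        q.append(c)
        for j, b in enumerate(g):
            r[j] -= c * b
        r.pop(0)
    return q, trim(r)

def reduce_pair(a, b, f, g):
    """From any a, b with a*f + b*g = 1, return the pair (a_0, b_0) of Theorem 7."""
    c, a0 = divide(a, g)        # a = c*g + a_0, the first equation of (18)
    b0 = add(b, mul(c, f))      # b_0 = b + c*f
    return a0, b0

f = [1, 0, 1]            # x^2 + 1, degree n = 2
g = [1, 0]               # x, degree m = 1
a = [1, 1, 1]            # x^2 + x + 1
b = [-1, -1, -2, -1]     # -x^3 - x^2 - 2x - 1, chosen so that a*f + b*g = 1
a0, b0 = reduce_pair(a, b, f, g)
```

The reduced pair is a_0(x) = 1 and b_0(x) = -x, of degrees at most m - 1 = 0 and n - 1 = 1, and indeed 1·(x^2 + 1) + (-x)·x = 1.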

There is also a result which is somewhat analogous to Theorem 7 for the
case where f(x) and g(x) are not relatively prime. We state it as

Theorem 8. Let f(x) ≠ 0 and g(x) ≠ 0 have respective degrees n and m.
Then polynomials a(x) ≠ 0 of degree at most m - 1 and b(x) ≠ 0 of degree
at most n - 1 such that

(19) a(x)f(x) + b(x)g(x) = 0

exist if and only if f(x) and g(x) are not relatively prime.

For if the g.c.d. of f(x) and g(x) is a nonconstant polynomial d(x), we
have f(x) = f_1(x)d(x), g(x) = g_1(x)d(x), g_1(x)f(x) + [-f_1(x)g(x)] = 0 where
g_1(x) has degree less than m and f_1(x) has degree less than n. Conversely,
let (19) hold. If f(x) and g(x) are relatively prime, we have a_0(x)f(x) +
b_0(x)g(x) = 1, a(x) = a_0(x)a(x)f(x) + a(x)b_0(x)g(x) = g(x)[a(x)b_0(x) -
a_0(x)b(x)]. But then g(x) of degree m divides a(x) ≠ 0 of degree at most
m - 1 which is impossible.


EXERCISES 

1. Extend Theorems 5, 6, and the corollary to a set of polynomials $f_1(x), \ldots,
f_t(x)$.

2. Let $f_1(x), \ldots, f_t(x)$ be all polynomials of the first degree. State their
possible g.c.d.'s and the conditions on the $f_i(x)$ for each such possible g.c.d.

3. State the results corresponding to those above for polynomials of virtual
degree two.

4. Prove that the g.c.d. of $f(x)$ and $g(x)$ is the monic polynomial of least possible
degree of the form (10). Hint: Show that if $d(x)$ is this polynomial then $f(x) =
q(x)d(x) + r(x)$, $r(x)$ has the form (10) as well as degree less than that of $d(x)$ and
so must be zero.

5. A polynomial $f(x)$ is called rationally irreducible if $f(x)$ has rational coefficients
and is not the product of two nonconstant polynomials with rational coefficients.
What are the possible g.c.d.'s of a set of rationally irreducible $f_i(x)$ of Ex. 1?

6. Let $f(x) = 0$ be rationally irreducible, $g(x)$ have rational coefficients. Show
that $f(x)$ either divides $g(x)$ or is prime to $g(x)$. Thus, $f(x)$ is prime to $g(x)$ if the
degree of $g(x)$ is less than that of $f(x)$.

7. Use Ex. 1 of Section 2 together with the results above to show that a rationally
irreducible polynomial has no multiple roots.

8. Find the g.c.d. of each of the following sets of polynomials as well as of all
possible pairs of polynomials in each case: 

a) /i = 2x^ — ~ 2x* — 6 x + 4 

/2 = + X® — X* — 2x -• 2 

b) /i = 3x^ + 8x2 - 3 

/z = X® + 2 x 2 ^ 0 

c) /i = x^ — 2x® — 2x2 — 2x — 3 
/2 = x* + 6 x 2 + 11 ^ + 0 
/a = x^ - 8x2 - 5x + 6 

d) /i = a:® + 4x2 + X — 6 
/2 = x2 - 3x + 2 
/a = X® — 3x* + x2 — 4x — 4 

e) /i = X® + 2 x^ - X® - 5 x 2 - 6 x - 3
/2 = x® + a :2 + 3 x + 3
/s = - X - 1

f) /i = x® + 2 x 2 - 3 x - 6
/2 = x 2 + X - 2
/a = a:* + 2 x 2 + X + 2

9. Let $f(x)$ be a rationally irreducible polynomial and $c$ be a complex root of
$f(x) = 0$. Show that, if $g(x)$ is a polynomial in $x$ with rational coefficients and
$g(c) \neq 0$, there then exists a polynomial $h(x)$ of degree less than that of $f(x)$ and
with rational coefficients such that $g(c)h(c) = 1$.

10. Let $f(x)$ be a rationally irreducible quadratic polynomial and $c$ be a complex
root of $f(x) = 0$. Show that every rational function of $c$ with rational coefficients
is uniquely expressible in the form $a + bc$ with $a$ and $b$ rational numbers.

11. Let/i, /the polynomials in x of virtual degree n and/i 9 ^ 0. Use Ex. 4
of Section 3 to show that if d(x) is the g.c.d. of/i, . . . , /«then the g.c.d. of /i, . . . ,/<
is d. Thus, show that the g.c.d. of n(/i),..., n(ft) has the form x*3 for an integer

0 .

7. Forms. A polynomial of degree $n$ is frequently spoken of as an $n$-ic
polynomial. The reader is already familiar with the terms linear, quadratic,
cubic, quartic, and quintic polynomial in the respective cases $n = 1, 2, 3, 4, 5$.

In a similar fashion a polynomial in $x_1, \ldots, x_q$ is called a $q$-ary polynomial.
As above, we specialize the terminology in the cases $q = 1, 2, 3, 4, 5$ to be
unary, binary, ternary, quaternary, and quinary.

The terminology just described is used much more frequently in connection
with theorems on forms than in the study of arbitrary polynomials.
In particular, we shall find that our principal interest is in $n$-ary quadratic
forms.

There are certain special forms which are quadratic in a set of variables
$x_1, \ldots, x_m, y_1, \ldots, y_n$ and which have special importance because they
are linear in both $x_1, \ldots, x_m$ and $y_1, \ldots, y_n$, separately. We shall call
such forms bilinear forms. They may be expressed as forms

(20) $f = \sum_{i=1}^{m} \sum_{j=1}^{n} x_i a_{ij} y_j$,

so that we may thus write

(21) $f = \sum_{j=1}^{n} a_j y_j$, $\quad a_j = \sum_{i=1}^{m} x_i a_{ij}$,

and see that $f$ may be regarded as a linear form in $y_1, \ldots, y_n$ whose
coefficients are linear forms in $x_1, \ldots, x_m$.

A bilinear form $f$ is called symmetric if it is unaltered by the interchange
of correspondingly labeled members of its two sets of variables. This statement
clearly has meaning only if $m = n$; and $f$ is symmetric if and only if
$f = \sum x_i a_{ij} y_j = \sum y_i a_{ij} x_j$. But $f = \sum y_j a_{ij} x_i$, and hence $f$ is symmetric if and
only if $m = n$,

(22) $a_{ij} = a_{ji}$ $\quad (i, j = 1, \ldots, n)$.

A quadratic form $f$ is evidently a sum of terms of the type $a_i x_i^2$ as well as
the type $c_{ij} x_i x_j$ for $i < j$. We may write $a_{ii} = a_i$, $a_{ij} = a_{ji} = \tfrac{1}{2} c_{ij}$ for $i \neq j$
and have $c_{ij} x_i x_j = a_{ij} x_i x_j + a_{ji} x_j x_i$, so that

(23) $f = \sum_{i,j=1}^{n} x_i a_{ij} x_j$ $\quad (a_{ij} = a_{ji};\ i, j = 1, \ldots, n)$.

We compare this with (22) and conclude that a quadratic form may be
regarded as the result of replacing the variables $y_1, \ldots, y_n$ in a symmetric
bilinear form in $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$ by $x_1, \ldots, x_n$, respectively.

Later we shall obtain a theory of equivalence of quadratic forms and shall 
use the result just derived to obtain a parallel theory of symmetric bilinear 
forms. 
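
The passage from a quadratic form to a symmetric bilinear form can be sketched numerically. The code below is an illustration under stated assumptions: the function names `bilinear` and `symmetric_matrix` and the sample form are hypothetical, and indices are 0-based in the code while the text counts from 1.

```python
def bilinear(a, x, y):
    """Value of the bilinear form: the sum of x_i * a_ij * y_j over all i, j."""
    return sum(x[i] * a[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

def symmetric_matrix(diag, cross):
    """Build (a_ij) with a_ii = diag[i] and a_ij = a_ji = c_ij / 2.

    `cross` maps index pairs (i, j) with i < j to the coefficient c_ij
    of the product x_i * x_j in the quadratic form.
    """
    n = len(diag)
    a = [[0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = diag[i]
    for (i, j), c in cross.items():
        a[i][j] = a[j][i] = c / 2
    return a

# Illustrative form f = 2*x1^2 + 3*x1*x2 - x2^2.
a = symmetric_matrix([2, -1], {(0, 1): 3})
x = [1.0, 2.0]
direct = 2 * x[0] ** 2 + 3 * x[0] * x[1] - x[1] ** 2
print(a)
print(bilinear(a, x, x), direct)  # the two values agree
```

Setting the second set of variables equal to the first in the symmetric bilinear form recovers the quadratic form, which is the comparison with (22) made above.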

A final type of form of considerable interest is the skew bilinear form.
Here again $m = n$, and we call a bilinear form $f$ skew if $f = f(x_1, \ldots, x_n;
y_1, \ldots, y_n) = -f(y_1, \ldots, y_n; x_1, \ldots, x_n)$. Thus skew bilinear forms are
forms of the type

(24) $f = \sum_{i,j=1}^{n} x_i a_{ij} y_j$,

where

(25) $a_{ji} = -a_{ij}$ $\quad (i, j = 1, \ldots, n)$.

It follows that $a_{ii} + a_{ii} = 0$, that is

(26) $a_{ii} = 0$ $\quad (i = 1, \ldots, n)$.

Hence $f$ is a sum of terms $a_{ij}(x_i y_j - x_j y_i)$ for $i < j$, $i = 1, \ldots, n - 1$,
$j = 2, \ldots, n$. It is also evident that if we replace the $y_j$ by corresponding
$x_j$, then the new quadratic form $f(x_1, \ldots, x_n; x_1, \ldots, x_n)$ is the zero polynomial.
It is important for the reader to observe thus that while (22) may
be associated with both quadratic and symmetric bilinear forms we must
associate (25) only with skew bilinear forms.
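
The two properties of skew bilinear forms noted above can be checked on a sample. This is a minimal sketch: the coefficient array and the helper names `bilinear` and `is_skew` are assumptions made for the illustration.

```python
def bilinear(a, x, y):
    """Value of the bilinear form: the sum of x_i * a_ij * y_j over all i, j."""
    return sum(x[i] * a[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

def is_skew(a):
    """Check condition (25): a_ji = -a_ij for all i, j."""
    n = len(a)
    return all(a[j][i] == -a[i][j] for i in range(n) for j in range(n))

# An illustrative array of coefficients satisfying (25), hence (26).
a = [[0, 2, -1],
     [-2, 0, 5],
     [1, -5, 0]]
x, y = [1, 2, 3], [4, -1, 2]

print(is_skew(a))
print(bilinear(a, x, y), bilinear(a, y, x))  # equal up to sign
print(bilinear(a, x, x))                     # zero when y is replaced by x
```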

ORAL EXERCISES 

1. Use the language above to describe the following forms: 

a) + 3x7j^ + d) x\ + 2x]yx 

b) x\ + y\ e) xiy 2 - x^yx 

c) 2xxyx + x^vx + Xxy 2 

2. Express the following quadratic forms as sums of the kind given by (23): 

a) 2xi + 3 xiX2 

b) xl — xl + 2xxXz + 2x1 ■“ 4x2Xa 

8. Linear forms. A linear form is expressible as a sum

(27) $f = a_1 x_1 + \cdots + a_n x_n$.

We shall call (27) a linear combination of $x_1, \ldots, x_n$ with coefficients
$a_1, \ldots, a_n$. The concept of linear combination has already been used without
the name in several instances. Thus any polynomial in $x$ is a linear
combination of a finite number of non-negative integral powers of $x$ with
constant coefficients, a polynomial in $x_1, \ldots, x_q$ is a linear combination
of a finite number of power products $x_1^{k_1} \cdots x_q^{k_q}$ with constant coefficients,
and the g.c.d. of $f(x)$ and $g(x)$ is a linear combination (10) of $f(x)$ and $g(x)$ with
polynomials in $x$ as coefficients.

The form (27) with $a_1 = a_2 = \cdots = a_n = 0$ is the zero form. If $g$ is
a second form,

(28) $g = b_1 x_1 + \cdots + b_n x_n$,

with constant coefficients $b_1, \ldots, b_n$, we see that

(29) $f + g = (a_1 + b_1)x_1 + \cdots + (a_n + b_n)x_n$.

Also if $c$ is any constant, we have

(30) $cf = (ca_1)x_1 + \cdots + (ca_n)x_n$.

We define $-f$ to be the form such that $f + (-f) = 0$ and see that

(31) $-f = -1 \cdot f = (-a_1)x_1 + \cdots + (-a_n)x_n$.

Then $f = g$ if and only if $f - g = f + (-g) = 0$, that is $a_i = b_i$ $(i = 1,
\ldots, n)$.

The properties just set down are only trivial consequences of the usual 
properties of polynomials and, as such, may seem to be relatively unimpor¬ 
tant. They may be formulated abstractly, however, as properties of se¬ 
quences of constants (which may be thought of, if we so desire, as the coeffi¬ 
cients of linear forms) and in this formulation in Chapter IV will be very 
important for all algebraic theory. The reader is already familiar with these 
properties which he has used in the computation of determinants by opera¬ 
tions on its rows and columns. 

Let, then, $u$ be a sequence

(32) $u = (a_1, \ldots, a_n)$

of $n$ constants $a_i$ called the elements of the sequence $u$. If $a$ is any constant,
we define

(33) $au = ua = (aa_1, \ldots, aa_n)$

and call $au$ the scalar product of $u$ by $a$. We now consider a second sequence,

(34) $v = (b_1, \ldots, b_n)$,

and define the sum of $u$ and $v$ by

(35) $u + v = (a_1 + b_1, \ldots, a_n + b_n)$.

Then the linear combination

(36) $au + bv = (aa_1 + bb_1, \ldots, aa_n + bb_n)$

has been uniquely defined for all constants $a$ and $b$ and all sequences
$u$ and $v$.

The sequence all of whose elements are zero will be called the zero
sequence and designated by 0. It is clearly the unique sequence $z$ with the
property that $u + z = u$ for every sequence $u$. Evidently if $a$ is the zero
constant, $au = 0$ for every $u$.

We define the negative $-u$ of a sequence $u$ to be the sequence $v$ such
that $u + v = 0$. Evidently, then, $-u$ is the unique sequence

(37) $-u = -1 \cdot u = (-a_1, \ldots, -a_n)$,

and we see that the unique solution of the equation $u + x = v$ is the
sequence $v + (-u)$. We evidently call this sequence

(38) $v - u = (b_1 - a_1, \ldots, b_n - a_n)$.

The reader should observe now that the definitions and properties derived 
for linear combinations of sequences are precisely those which hold for the 
sequences of coefficients of corresponding linear combinations of linear 
forms and that the usual laws of algebra for addition and multiplication 
hold for addition of sequences and multiplication of sequences by constants. 
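
The operations on sequences defined in (33) through (38) translate directly into code. The sketch below represents a sequence as a Python tuple; the function names are assumptions chosen for this illustration.

```python
def scalar_multiple(a, u):
    """The sequence au of (33): every element of u multiplied by a."""
    return tuple(a * ui for ui in u)

def add(u, v):
    """The sum u + v of (35); both sequences must have n elements."""
    if len(u) != len(v):
        raise ValueError("sequences must have the same number of elements")
    return tuple(ui + vi for ui, vi in zip(u, v))

def linear_combination(a, u, b, v):
    """The linear combination au + bv of (36)."""
    return add(scalar_multiple(a, u), scalar_multiple(b, v))

def subtract(v, u):
    """The difference v - u of (38), formed as v + (-1)u."""
    return add(v, scalar_multiple(-1, u))

u = (1, -2, 3)
v = (4, 0, -1)
print(linear_combination(2, u, 3, v))
print(subtract(v, u))
print(add(u, subtract(v, u)) == v)  # v - u solves u + x = v
```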

9. Equivalence of forms. If $f = f(x_1, \ldots, x_q)$ is a form of degree $n$ in
$x_1, \ldots, x_q$ and we replace every $x_i$ in $f$ by a corresponding linear form

(39) $x_i = a_{i1} y_1 + \cdots + a_{ir} y_r$ $\quad (i = 1, \ldots, q)$,

we obtain a form $g = g(y_1, \ldots, y_r)$ of the same degree $n$ in $y_1, \ldots, y_r$.
Then we shall say that $f$ is carried into $g$ (or that $g$ is obtained from $f$) by
the linear mapping (39). If $q = r$ and the determinant

(40) $\begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1q} \\ \vdots & \vdots & & \vdots \\ a_{q1} & a_{q2} & \cdots & a_{qq} \end{vmatrix}$

is not zero, we shall say that (39) is nonsingular. In this case it is easily
seen that we may solve (39) for $y_1, \ldots, y_q$ as linear forms in $x_1, \ldots, x_q$ and
obtain a linear mapping which we may call the inverse of (39). This terminology
is justified by the fact that the equation $f(x_1, \ldots, x_q) = g(y_1,
\ldots, y_q)$ is an identity, and thus if we replace $y_1, \ldots, y_q$ in $g(y_1, \ldots, y_q)$
by the corresponding linear forms in $x_1, \ldots, x_q$ we obtain the original form
$f(x_1, \ldots, x_q)$.

We now consider two forms $f = f(x_1, \ldots, x_q)$ and $g = g(x_1, \ldots, x_q)$ of
the same degree $n$. Then we shall say that $f$ is equivalent to $g$ if $f$ is carried
into $g(y_1, \ldots, y_q)$ by a nonsingular linear mapping. The statements above
imply that if $f$ is equivalent to $g$ then $g$ is also equivalent to $f$. Thus, we
shall usually say simply that $f$ and $g$ are equivalent. We shall not study the
equivalence of forms of arbitrary degree but only of the special kinds of
forms described in Section 7, and even of those forms only under restricted
types of linear mappings.
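
A nonsingular linear mapping and its inverse can be illustrated for two variables. The sketch below is an assumption-laden example rather than a general method: the mapping, the form $f$, and the helper names `apply_map` and `inverse_2x2` are all chosen for the illustration, and the inverse is computed by the usual two-variable determinant formula.

```python
from fractions import Fraction

def apply_map(m, y):
    """Evaluate x_i = m[i][0]*y_1 + ... + m[i][r-1]*y_r, as in (39)."""
    return [sum(mij * yj for mij, yj in zip(row, y)) for row in m]

def inverse_2x2(m):
    """Coefficients of the inverse mapping for a nonsingular 2 by 2 array."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("the mapping is singular")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def f(x):
    """Illustrative quadratic form f = x1^2 + x2^2."""
    return x[0] ** 2 + x[1] ** 2

m = [[2, 1], [1, 1]]  # x1 = 2*y1 + y2, x2 = y1 + y2; determinant 1

def g(y):
    """The form into which f is carried: g(y) = f(x(y))."""
    return f(apply_map(m, y))

m_inv = inverse_2x2(m)
x = [3, -5]
y = apply_map(m_inv, x)       # solve (39) for y in terms of x
print(y, apply_map(m, y))     # mapping y back recovers x
print(g(y), f(x))             # f(x) = g(y) holds identically
```

Expanding by hand, this mapping carries $f = x_1^2 + x_2^2$ into $g = 5y_1^2 + 6y_1y_2 + 2y_2^2$.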

We have now obtained the background needed for a clear understanding 
of matrix theory and shall proceed to its development. 

EXERCISES 

1. The linear mapping (39) of the form $x_i = y_i$ for $i = 1, \ldots, q$ is called the
identical mapping. What is its effect on any form $f$?

2. Apply a nonsingular linear mapping to carry each of the following forms to an
expression of the type $a_1 y_1^2 + a_2 y_2^2$. Hint: Write $f = a_1(x_1 + cx_2)^2 + \ldots$ by completing
the square on the term in $x_1^2$ and put $x_1 + cx_2 = y_1$, $x_2 = y_2$.

a) 2x1 — ^XiX 2 + Zxi d) 2x1 — XiXz 

b) xl + 14 x 1 X 2 + 9x1 e) 3x1 + 2 x 1 X 2 — xi 

c) 3x? + 18 x 1 X 2 + 24x1 

3. Find the inverses of the following linear mappings:

a) $2x_1 + x_2 = y_1$, $\quad 3x_1 + 2x_2 = y_2$

b) $x_1 + x_2 = y_1$, $\quad -2x_1 + x_2 = y_2$

4. Apply the linear mappings of Ex. 3 to the following forms $f$ to obtain equivalent
forms $g$ and their inverses to $g$ to obtain $f$.

a) $f = x_1^2 + x_2^2$ $\qquad$ b) $f = 4x_1^2 - 4x_1x_2 + 3x_2^2$

RECTANGULAR MATRICES AND ELEMENTARY
TRANSFORMATIONS

1. The matrix of a system of linear equations. The concept of a rectangular
matrix may be thought of as arising first in connection with the
study of the solution of a system

(1) $\begin{cases} a_{11}y_1 + a_{12}y_2 + \cdots + a_{1n}y_n = k_1, \\ a_{21}y_1 + a_{22}y_2 + \cdots + a_{2n}y_n = k_2, \\ \qquad \vdots \\ a_{m1}y_1 + a_{m2}y_2 + \cdots + a_{mn}y_n = k_m \end{cases}$

of $m$ linear equations in $n$ unknowns $y_1, \ldots, y_n$, with constant coefficients
$a_{ij}$. The array of coefficients arranged as they occur in (1) has the form

(2) $A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$

and is called the coefficient matrix of the system (1). We shall henceforth
speak of the coefficients $a_{ij}$ and $k_i$ in (1) as scalars and shall derive our
theorems with the understanding that they are constants (with respect to the
variables $y_1, \ldots, y_n$) according to the usual definitions and hence satisfy
the properties usually assumed in algebra for rational operations. In a later
chapter we shall make a completely explicit statement about the nature
of these quantities.

It is not only true that the concept of a matrix arises as above in the 
study of systems of linear equations, but many matrix properties are ob¬ 
tainable by observing the effect, on the matrix of a system, of certain natu¬ 
ral manipulations on the equations themselves with which the reader is 
very familiar. We shall devote this beginning chapter on matrices to that 
study. 

Let us now recall some terminology with which the reader is undoubtedly
familiar. The line

(3) $u_i = (a_{i1}, \ldots, a_{in})$

of coefficients in the $i$th equation of (1) occurs in (2) as its $i$th horizontal
line. Thus, it is natural to call $u_i$ the $i$th row of the matrix $A$. Similarly,
the coefficients of the unknown $y_j$ in (1) form a vertical line

(4) $u^{(j)} = \begin{pmatrix} a_{1j} \\ \vdots \\ a_{mj} \end{pmatrix}$

which we call the $j$th column of $A$.

We may now speak of $A$ as a matrix of $m$ rows and $n$ columns, as an
$m$-rowed and $n$-columned matrix or, briefly, as an $m$ by $n$ matrix. Then
the rows of $A$ are 1 by $n$ matrices and its columns are $m$ by 1 matrices. We
shall speak of the scalars $a_{ij}$ as the elements of $A$, and they may be regarded
as one by one matrices. The notation $a_{ij}$ which we adopt for the element
of $A$ in its $i$th row and $j$th column will be used consistently, and this usage
will be of some importance in the clarity of our exposition. To avoid bulky
displayed equations we shall usually not use the notation (2) for a matrix
but shall write instead

(5) $A = (a_{ij})$ $\quad (i = 1, \ldots, m;\ j = 1, \ldots, n)$.

If $m = n$ then $A$ is a square matrix, and we shall speak of $A$ simply as an
$m$-rowed square matrix. This, too, is a concept and terminology which we
shall use very frequently.
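
The row, column, and element terminology can be mirrored with a nested list. The sketch below is an illustration only: the sample matrix and the helper names are assumptions, and the helpers take the 1-based indices of the text and convert them to Python's 0-based indexing.

```python
# An illustrative 3 by 4 matrix stored as a list of its rows.
A = [[1, -2, 0, 3],
     [3, 4, -1, 0],
     [1, 2, 3, 1]]

def shape(a):
    """Return (m, n) for an m by n matrix."""
    return len(a), len(a[0])

def element(a, i, j):
    """The element in the ith row and jth column (1-based, as in the text)."""
    return a[i - 1][j - 1]

def row(a, i):
    """The ith row, a 1 by n matrix."""
    return list(a[i - 1])

def column(a, j):
    """The jth column, an m by 1 matrix written here as a flat list."""
    return [r[j - 1] for r in a]

print(shape(A))
print(element(A, 2, 3))
print(row(A, 2))
print(column(A, 4))
```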


ORAL EXERCISES 

1. Read off the elements $a_{11}$, $a_{12}$, $a_{24}$, $a_{23}$ in the following matrices


a) 


1 

-2 

0 

3\ 

/ ^ 

0 

-2 


3 

4 

-1 

o| 


0 

1 

A 

1 

2 

3 

ll 

3 

4 

1 

3 

4 

1 

2/ 

\ 5 

1 

6 

7 / 


c) 


4 0 5 

0 -1 ~2 -3 
3 14 7 

6 0 -1 sy 


. / 3 2 4 5\ 

[-1 1 6 -e) 


2. Read off the second row and the third column in each of the matrices of Ex. 1. 

3. Read off the systems of equations (1) with constants $k_i$ all zero and matrices
of coefficients as in Ex. 1.

2. Submatrices. In solving the system (1) by the usual methods the
reader is led to study subsystems of $s \leq m$ equations in certain $t \leq n$ of
the unknowns. The corresponding coefficient matrix has $s$ rows and $t$ columns,
and its elements lie in certain $s$ of the rows and $t$ of the columns of $A$.
We call such a matrix an $s$ by $t$ submatrix $B$ of $A$. If $s < m$ and $t < n$,
the elements in the remaining $m - s$ rows and $n - t$ columns form an
$m - s$ by $n - t$ submatrix $C$ of $A$ and we shall call $C$ the complementary
submatrix of $B$. Clearly, then, $B$ is the complementary submatrix of $C$.

It will be desirable from time to time to regard a matrix as being made
up of certain of its submatrices. Thus we write

(6) $A = (A_{ij})$ $\quad (i = 1, \ldots, s;\ j = 1, \ldots, t)$,

where now the symbols $A_{ij}$ themselves represent rectangular matrices. We
assume that for any fixed $i$ the matrices $A_{i1}, A_{i2}, \ldots, A_{it}$ all have the same
number of rows, and for fixed $k$ the matrices $A_{1k}, A_{2k}, \ldots, A_{sk}$ have the
same number of columns. It is then clear how each row of $A$ is a 1 by $t$
matrix whose elements are rows of $A_{i1}, \ldots, A_{it}$ in adjacent positions and
similarly for columns. We have thus accomplished what we shall call the
partitioning (6) of $A$ by what amounts to drawing lines mentally parallel to
the rows and columns of $A$ and between them and designating the arrays
of elements in the smallest rectangles so formed by $A_{ij}$. Our principal use
of (6) will be the use of the case where we shall regard $A$ as a two by two
matrix

(7) $A = \begin{pmatrix} A_1 & A_2 \\ A_3 & A_4 \end{pmatrix}$

whose elements $A_1$, $A_2$, $A_3$, $A_4$ are themselves rectangular matrices. Then
$A_1$ and $A_2$ have the same number of rows, $A_3$ and $A_4$ have the same number
of rows, and every row of $A$ consists partially of a row of $A_1$ and of a
corresponding row of $A_2$ or of a row of $A_3$ and a corresponding row of $A_4$.

Note our usage in (2), (5), (6), (7) of the symbol of equality for matrices.
We shall always mean that two matrices are equal if and only if they are
identical, that is, have the same size and equal corresponding elements.
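
The partitioning of a matrix into a two by two matrix of submatrices can be sketched as follows. The function names `partition` and `assemble`, the split positions, and the sample matrix are assumptions made for this illustration.

```python
def partition(a, r, s):
    """Split a into four blocks A1, A2, A3, A4, where A1 is r by s.

    A1 and A2 share the first r rows; A3 and A4 share the remaining rows.
    """
    a1 = [row[:s] for row in a[:r]]
    a2 = [row[s:] for row in a[:r]]
    a3 = [row[:s] for row in a[r:]]
    a4 = [row[s:] for row in a[r:]]
    return a1, a2, a3, a4

def assemble(a1, a2, a3, a4):
    """Rebuild the matrix: each row is a row of A1 followed by the
    corresponding row of A2, or a row of A3 followed by one of A4."""
    top = [p + q for p, q in zip(a1, a2)]
    bottom = [p + q for p, q in zip(a3, a4)]
    return top + bottom

A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]
A1, A2, A3, A4 = partition(A, 2, 1)
print(A1, A2, A3, A4)
print(assemble(A1, A2, A3, A4) == A)  # equal: same size, same elements
```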

EXERCISES 

1. State how the columns of $A$ of (7) are connected with the columns of $A_1$, $A_2$,
$A_3$, and $A_4$.

2. Introduce a notation for an arbitrary six-rowed square matrix $A$ and partition
$A$ into a three-rowed square matrix whose elements are two-rowed square matrices.
Also partition $A$ into a two-rowed square matrix whose elements are three-rowed
square matrices.

3. Write out all submatrices of the matrix

$\begin{pmatrix} 2 & -1 & 3 & 4 & 5 \\ 1 & 0 & 2 & -1 & -2 \\ 0 & 1 & 2 & 3 & 6 \\ 7 & 3 & 2 & 1 & 0 \end{pmatrix}$

and, if they exist, the complementary submatrices.

4. Which of the submatrices in Ex. 3 occur in some partitioning of $A$ as a matrix
of submatrices?

5. Partition the following matrices so that they become three-rowed square
matrices whose elements are two-rowed square matrices and state the results in the
notation (6).


/2 

-1 

3 

4 

1 

2) 


1 

-1 

0 

0 

1 


0 

1 

1 

-1 

2 

i| 


' 1 

1 

0 

0 

0 

1 

1 

-2 

1 

0 

0 

0 

b) 

-2 

0 

2 

0 

3 

1 

0 

0 

1 

1 

-1 

3 1 

0 

2 

0 

2 

-1 

2 

0 

0 

4 

2 

1 

0 


1 ® 

0 

-1 

1 

-4 

0 

,0 

0 

-1 

3 

0 

1 


\ 0 

0 

-1 

-1 

0 

- 4 / 


6. Partition the matrices of Ex. 5 into two-rowed square matrices whose elements
are three-rowed square matrices.

7. Partition the matrices of Ex. 6 into the form (7) such that $A_1$ is a two by three
matrix; a one by six matrix; a two by two matrix. Read off $A_2$, $A_3$, and $A_4$ and
state their sizes.

3. Transposition. The theory of determinants arose in connection with
the solution of the system (1). The reader will recall that many of the
properties of determinants were only proved as properties of the rows of a
determinant, and then the corresponding column properties were merely
stated as results obtained by the process of interchanging rows and columns.
We call the induced process transposition and define it as follows for
matrices. Let $A$ be an $m$ by $n$ matrix, a notation for which is given by (5),
and define the matrix

(8) $A' = (a'_{ji})$ $\quad (a'_{ji} = a_{ij};\ j = 1, \ldots, n;\ i = 1, \ldots, m)$,

which we shall call the transpose of $A$. It is an $n$ by $m$ matrix obtained from
$A$ by interchanging its rows and columns. Thus, the element $a_{ij}$ in the $i$th
row and $j$th column of $A$ occurs in $A'$ as the element in its $j$th row and $i$th
column. Note then that in accordance with our conventions (8) could have
been written as

(9) $A' = (a_{ij})$ $\quad (j = 1, \ldots, n;\ i = 1, \ldots, m)$.

We also observe the evident theorem which we state simply as

(10) $(A')' = A$.

The operation of transposition may be regarded as the result of a certain
rigid motion of the matrix which we shall now describe. If $A$ is our $m$ by $n$
matrix and $m \leq n$ we put $q = m$ and write

(11) $A = (A_1, A_2)$,

where $A_1$ is a $q$-rowed square matrix. On the other hand, if $n < m$, we put
$q = n$ and have

(12) $A = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix}$,

where the matrix $A_1$ is again a $q$-rowed square matrix. The line of elements
$a_{11}, a_{22}, \ldots, a_{qq}$ of $A$ and hence of $A_1$ is called the principal diagonal of $A$
or, simply, the diagonal of $A$. It is a diagonal of the square matrix $A_1$ and
is its principal diagonal. We shall call the $a_{ii}$ the diagonal elements of $A$.
Notice now that $A'$ is obtained from $A$ by using the diagonal of $A$ as an
axis for a rigid rotation of $A$ so that each row of $A$ becomes a column of $A'$.
We should also observe that if $A$ has been partitioned so that it has the
form (6) then $A'$ is the $t$ by $s$ matrix of matrices given by

(13) $A' = (G_{ji})$ $\quad (G_{ji} = A'_{ij};\ j = 1, \ldots, t;\ i = 1, \ldots, s)$.

We have now given some simple concepts in the theory of matrices and
shall pass on to a study of certain fundamental operations.
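
Transposition is easy to sketch for a matrix stored as a list of rows. The function name `transpose` and the sample matrix are assumptions made for the illustration; indices in the code are 0-based.

```python
def transpose(a):
    """Return A': the element in row i, column j of A moves to row j, column i."""
    return [list(col) for col in zip(*a)]

A = [[1, 2, 3],
     [4, 5, 6]]          # a 2 by 3 matrix, so q = 2
At = transpose(A)        # a 3 by 2 matrix

print(At)
print(transpose(At) == A)                            # (A')' = A
print([A[i][i] for i in range(2)],
      [At[i][i] for i in range(2)])                  # the diagonal is fixed
```

Each row of `A` appears as a column of `At`, and the diagonal elements stay in place, in line with the description of transposition as a rotation about the diagonal.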

EXERCISES 

1. Let A have the form (7). Give the corresponding notation for A'. Give also 
A' if A is any matrix of Ex. 1 of Section 1. 

2. In Ex. 1 assume that A = A'. What then is the form (7) of A ? Obtain the 
analogous result if A = —A', where — A is the matrix whose elements are the nega¬ 
tives of those of A. 

3. Let $A$ be a three-rowed square matrix. Find the form (2) of $A$ if $A = A'$ and
also if $A = -A'$.

4. Prove that the determinant of every three-rowed square matrix $A$ with the
property that $A' = -A$ is zero.

5. Solve the system (1) with matrix



for $y_1, y_2, y_3$ in terms of $k_1, k_2, k_3$. Write the results as $y_i = \sum_{j=1}^{3} b_{ij} k_j$ and thus
compute the matrix $B = (b_{ij})$. Do this also for the system (1) with matrix $A'$ and
compare the results.

4. Elementary transformations. The system (1) may be solved by the
method of elimination, and the reader is familiar with the operations on
equations which are permitted in this method and which yield systems
said to be equivalent to (1). The resulting operations on the rows of the
matrix $A$ of the system and corresponding operations on the columns of $A$
are called elementary transformations on $A$ and will turn out to be very useful
tools in the theory of matrices.

The first of our transformations is the result on the rows of $A$ of the
interchange of two equations of the defining system. We define this and the
corresponding column transformation in the

Definition 1. Let $i \neq r$ and $B$ be the matrix obtained from $A$ by
interchanging its $i$th and $r$th rows (columns). Then $B$ is said to be obtained from $A$
by an elementary row (column) transformation of type 1.

The rows (columns) of an m by n matrix are sequences of n (of m) ele¬ 
ments, and the operations of addition and scalar multiplication (i.e., mul¬ 
tiplication by a scalar) of such sequences were defined in Section 1.8.* The 
left members of (1) are linear forms. The addition of a scalar multiple of 
one equation of (1) to another results in the addition of a corresponding 
multiple of a corresponding linear form to another and hence to a corre¬ 
sponding result on the rows of A. Thus we make the following 

Definition 2. Let $i$ and $r$ be distinct integers, $c$ be a scalar, and $B$ be the
matrix obtained by the addition to the $i$th row (column) of $A$ of the multiple
by $c$ of its $r$th row (column). Then $B$ is said to be obtained from $A$ by an
elementary row (column) transformation of type 2.

* We shall use a corresponding notation henceforth when we make references
anywhere in our text to results in previous chapters. Thus, for example, by Section 4.7,
Theorem 4.8, Lemma 4.9, equation (4.10) we shall mean Section 7, Theorem 8, Lemma 9,
equation (10) in Chapter IV. However, if the prefix is omitted, as, for example, Theorem 8,
we shall mean that theorem of the chapter in which the reference is made.

Our final type of transformation is induced by the multiplication of an
equation of the system (1) by a nonzero scalar $a$. The restriction $a \neq 0$
is made so that $A$ will be obtainable from $B$ by the like transformation
for $a^{-1}$. Later we shall discuss matrices whose elements are polynomials
in $x$ and use elementary transformations with polynomial scalars $a$. We
shall then evidently require $a$ to be a polynomial with a polynomial inverse
and hence to be a constant not zero. In view of this fact we shall phrase
the definition in our present environment so as to be usable in this other
situation and hence state it as

Definition 3. Let the scalar $a$ possess an inverse and the matrix $B$ be
obtained as the result of the multiplication of the $i$th row (column) of $A$ by $a$.
Then $B$ is said to be obtained from $A$ by an elementary row (column*)
transformation of type 3.

The fundamental theorems in the theory of matrices are connected with
the study of the matrices obtained from a given matrix $A$ by the application
of a finite sequence of elementary transformations, restricted by the
particular results desired, to $A$. Thus, it is of basic importance to study
first what occurs if we make no restriction whatever on the elementary
transformations allowed. For convenience in our discussion we first make
the

Definition. Let $A$ and $B$ be $m$ by $n$ matrices and let $B$ be obtainable from $A$
by the successive application of finitely many arbitrary elementary
transformations. Then we shall say that $A$ is rationally equivalent to $B$ and indicate this
by writing $A \cong B$.

We now observe some simple consequences of our definition. First, we
see that, if $A$ is rationally equivalent to $B$ and $B$ is rationally equivalent
to $C$, the combination of the elementary transformations which carry $A$ to $B$
with those which carry $B$ to $C$ will carry $A$ to $C$. Then $A$ is rationally
equivalent to $C$. Observe next that every $m$ by $n$ matrix $A$ is rationally
equivalent to itself. For the elementary transformations of type 2 with
$c = 0$ and of type 3 with $a = 1$ are identical transformations leaving all
matrices unaltered.

Finally, we see that if an elementary transformation carries $A$ to $B$ there
is an inverse transformation of the same type carrying $B$ to $A$. In fact, the
inverse of any transformation of type 2 defined for $c$ is that defined for $-c$,
of type 3 defined by $a$ is that defined for $a^{-1}$, of type 1 is itself. But then $A$
is rationally equivalent to $B$ if and only if $B$ is rationally equivalent to $A$.
Thus we may and shall replace the terminology "$A$ is rationally equivalent
to $B$" in the definition above by "$A$ and $B$ are rationally equivalent."

* The reader should verify the fact that, if we apply any elementary row
transformation to $A$ and then any column transformation to the result, the matrix obtained is the
same as that which we obtain by applying first the column transformation and then the
row transformation.
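
The three types of elementary row transformations and their inverses can be sketched in a few lines. The function names, the sample matrix, and the 0-based row indices are assumptions of this illustration; column transformations could be handled analogously and are omitted here.

```python
from fractions import Fraction

def interchange(a, i, r):
    """Type 1: interchange rows i and r (0-based indices)."""
    b = [row[:] for row in a]
    b[i], b[r] = b[r], b[i]
    return b

def add_multiple(a, i, r, c):
    """Type 2: add c times row r to row i, for distinct i and r."""
    if i == r:
        raise ValueError("i and r must be distinct")
    b = [row[:] for row in a]
    b[i] = [x + c * y for x, y in zip(a[i], a[r])]
    return b

def scale(a, i, s):
    """Type 3: multiply row i by a scalar s that possesses an inverse."""
    if s == 0:
        raise ValueError("the scalar must be nonzero")
    b = [row[:] for row in a]
    b[i] = [s * x for x in a[i]]
    return b

A = [[2, 1, 0],
     [4, -1, 3]]

B = add_multiple(A, 1, 0, -2)
print(B)
# Each transformation is undone by one of the same type.
print(interchange(interchange(A, 0, 1), 0, 1) == A)
print(add_multiple(B, 1, 0, 2) == A)
print(scale(scale(A, 0, Fraction(1, 2)), 0, 2) == A)
```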

We have now shown that in order to prove that $A$ and $B$ are rationally
equivalent it suffices to prove $A$ and $B$ both rationally equivalent to the
same matrix $C$. As a tool in such proofs we then prove the following

Lemma 1. Let $r < m$, $s < n$, and $A$ and $B$ be $m$ by $n$ matrices of the form

$A = \begin{pmatrix} A_1 & 0 \\ 0 & A_2 \end{pmatrix}$, $\quad B = \begin{pmatrix} B_1 & 0 \\ 0 & B_2 \end{pmatrix}$

for $r$ by $s$ rationally equivalent matrices $A_1$ and $B_1$ and $m - r$ by $n - s$
rationally equivalent matrices $A_2$ and $B_2$. Then $A$ and $B$ are rationally equivalent.

For it is clear that any elementary transformation on the first $r$ rows and $s$
columns of $A$ induces a corresponding transformation on $A_1$ and leaves $A_2$
and the zero matrices bordering it above unaltered. Clearly the sequence
of such transformations induced by the transformations carrying $A_1$ to $B_1$
will replace $A$ by the matrix

$A_0 = \begin{pmatrix} B_1 & 0 \\ 0 & A_2 \end{pmatrix}$.

We similarly follow this sequence of elementary transformations by
elementary transformations on the last $m - r$ rows and $n - s$ columns of $A_0$
which carry $A_2$ to $B_2$ and obtain $B$.

It is important also to observe that we may arbitrarily permute the rows
of $A$ by a sequence of elementary row transformations of type 1, and
similarly we may permute its columns. For any permutation results from some
properly chosen sequence of interchanges.

Before continuing further with the study of rational equivalence we shall 
introduce the familiar properties of determinants in the language of matrix 
theory and shall also define some important special types of matrices. We 
shall then discuss another result used for the types of proofs mentioned 
above. 

5. Determinants. Let $B$ be the square matrix

(14) $B = (b_{ij})$ $\quad (i, j = 1, \ldots, t)$.

The corresponding symbol

(15) $D = \begin{vmatrix} b_{11} & \cdots & b_{1t} \\ b_{21} & \cdots & b_{2t} \\ \vdots & & \vdots \\ b_{t1} & \cdots & b_{tt} \end{vmatrix}$

is called a $t$-rowed determinant or determinant of order $t$. It is defined as the
sum of the $t!$ terms of the form

(16) $(-1)^i\, b_{1i_1} b_{2i_2} \cdots b_{ti_t}$,

where the sequence of subscripts $i_1, \ldots, i_t$ ranges over all permutations
of $1, 2, \ldots, t$ and the permutation $i_1, \ldots, i_t$ may be carried into $1, 2, \ldots,
t$ by $i$ interchanges. That the sign $(-1)^i$ is unique is proved in L. E. Dickson's
First Course in the Theory of Equations, and we shall assume this result
as well as all the consequent properties of determinants derived there.

The determinant $D$ will be spoken of here as the determinant of the matrix
$B$ and we shall indicate this by writing

(17) $D = |B|$

(read $D$ equals determinant $B$). Nonsquare matrices $A$ do not have
determinants, but their square submatrices have determinants called the
minors of $A$. If $A$ is a square matrix of $n > t$ rows, the complementary
submatrix of any $t$-rowed square submatrix $B$ is an $(n - t)$-rowed square matrix
whose determinant and that of $B$ are minors of $A$ called complementary
minors. In particular, every element $a_{ij}$ of a matrix $A$ defines a one-rowed
square submatrix of $A$ whose determinant is the element itself. Thus we
have seen that the elements of a matrix may be regarded either as its one-rowed
square submatrices or as its one-rowed minors. We now pass to a statement
of some of the most important results on determinants.

The result on the interchange of rows and columns of determinants men¬ 
tioned in Section 3 may now be stated as 
Lemma 2. Let $A$ be a square matrix. Then

(18) $|A'| = |A|$.

The next three properties of determinants are those frequently used in
the computation of determinants, and we shall state them now in the
language we have just introduced.

Lemma 3. Let $B$ be the matrix obtained from a square matrix $A$ by an
elementary transformation of type 1. Then $|B| = -|A|$.

Lemma 4. Let $B$ be the matrix obtained from a square matrix $A$ by an
elementary transformation of type 2. Then $|B| = |A|$.

Lemma 5. Let $B$ be the matrix obtained from a square matrix $A$ by an
elementary transformation of type 3 defined for a scalar $a$. Then $|B| = a \cdot |A|$.

The reader will recall that Lemma 3 may be used in a simple fashion
to obtain

Lemma 6. If a square matrix has two equal rows or columns, its
determinant is zero.

Another result of this type is

Lemma 7. If a square matrix has a zero row or column, its determinant is
zero.

Finally, we have

Lemma 8. Let $A$, $B$, $C$ be $n$-rowed square matrices such that the $i$th row
(column) of $C$ is the sum of the $i$th row (column) of $A$ and that of $B$ while all
other rows (columns) of $B$ and $C$ are the same as the corresponding rows
(columns) of $A$. Then

(19) $|C| = |A| + |B|$.
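
Lemmas 3, 4, 5, and 8 can be spot-checked numerically. The sketch below computes a determinant as a signed sum over permutations; as an assumption of this illustration, the sign is obtained from the parity of the number of inversions of the permutation, which agrees with the parity of the number of interchanges. The sample matrices and the name `det` are likewise chosen for the example.

```python
from itertools import permutations

def det(a):
    """Determinant as the signed sum of products over all permutations."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if perm[i] > perm[j])
        term = -1 if inversions % 2 else 1
        for row, col in enumerate(perm):
            term *= a[row][col]
        total += term
    return total

A = [[3, 2, -1], [1, 2, 0], [0, -1, 3]]

swapped = [A[1], A[0], A[2]]                                  # type 1
added = [A[0], A[1], [z + 5 * x for x, z in zip(A[0], A[2])]]  # type 2
scaled = [A[0], [4 * x for x in A[1]], A[2]]                   # type 3, a = 4

# Lemma 8: B and C agree with A except in the second column,
# and that column of C is the sum of those of A and B.
B = [[3, -3, -1], [1, -1, 0], [0, 0, 3]]
C = [[3, -1, -1], [1, 1, 0], [0, -1, 3]]

print(det(A), det(swapped), det(added), det(scaled))
print(det(C), det(A) + det(B))
```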

There are, of course, many other properties of determinants, and of these 
we shall use only very few. Those we shall use are, of course, also well 
known to the reader. Of particular importance is that result which might 
be used to define determinants by an induction on order and which does 
yield the actual process ordinarily used in the expansion of a determinant. 
We let A be an n-rowed square matrix A « (a<,) and define d*/ to be the 
complementary minor of a</. Then the result we refer to states that if we 
define c,< == (—then 

n n 

(20) IAI = UikCki = OjkOkj (i, j, = 1, . . . , n) . 

k^l ib-l 

Thus, the determinant of A is obtainable as the sum of the products of the 
elements an in any row (column) of A by their cofactors c,<, that is, the 
properly signed and labeled minors (—l)*+^di,. 

The result (20) is of fundamental importance in our theory of matrices 
and will be applied presently together with the following parallel result. 
Let B be the matrix obtained from a square matrix A by replacing the ith
row of A by its qth row, where $q \neq i$. Then B has two equal rows and by
Lemma 6 |B| = 0. We expand |B| as above according to the elements of its ith
row and obtain as its vanishing determinant the sum of the products of all
elements in the qth row of A by the cofactors of the elements in the ith row
of A. Combining this result with the corresponding property about columns
we have

(21) $\sum_{k=1}^{n} a_{qk} c_{ki} = 0 , \quad \sum_{k=1}^{n} c_{jk} a_{ks} = 0 \qquad (i \neq q;\ j \neq s;\ i, j, q, s = 1, \ldots, n) .$

The equations (20) and (21) exhibit certain relations between the arbitrary
square matrix A and the matrix we define as

(22) $\operatorname{adj} A = (c_{ij}) \qquad (i, j = 1, \ldots, n) .$

These relations will have important later consequences. We call the matrix
(22) the adjoint of A and see that if $A = (a_{ij})$ is an n-rowed square matrix
its adjoint is the n-rowed square matrix with the cofactor of the element
which appears in the jth row and ith column of A as the element in its
own ith row and jth column. Clearly, if A = 0 then adj A = 0.
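The relations (20) and (21) between A and adj A may be verified on a small example. The sketch below is illustrative only: the helper names and the sample matrix are assumptions, and indices are 0-based.

```python
def minor(a, i, j):
    """Delete row i and column j of a (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det(a):
    """Determinant by expansion along the first row."""
    if not a:
        return 1
    return sum((-1) ** k * a[0][k] * det(minor(a, 0, k)) for k in range(len(a)))


def adjoint(a):
    """adj A = (c_ij), where c_ij is the cofactor of the element in row j, column i of A."""
    n = len(a)
    return [[(-1) ** (i + j) * det(minor(a, j, i)) for j in range(n)] for i in range(n)]


A = [[3, 2, -1], [1, 2, 0], [0, -1, 3]]  # sample matrix, chosen only for illustration
C = adjoint(A)
n = len(A)

# Row relations: the sum over k of a_qk * c_ki is |A| when q = i and 0 otherwise.
row_sums = [[sum(A[q][k] * C[k][i] for k in range(n)) for i in range(n)] for q in range(n)]
# Column relations: the sum over k of c_jk * a_ks is |A| when j = s and 0 otherwise.
column_sums = [[sum(C[j][k] * A[k][s] for k in range(n)) for s in range(n)] for j in range(n)]
print(C)
print(row_sums)
print(column_sums)
```

Both tables of sums have |A| on the diagonal and zeros elsewhere.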


EXERCISES 

1. Compute the adjoint of each of the matrices 


a) 





0 -1 
2 1 
6 3 


d) 1 1 
\2 3 



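2. Expand the determinants below and verify the following instances of
Lemma 8:

a) $\begin{vmatrix} 3 & 2 & -1 \\ 1 & 2 & 0 \\ 0 & -1 & 3 \end{vmatrix} + \begin{vmatrix} 3 & -3 & -1 \\ 1 & -1 & 0 \\ 0 & 0 & 3 \end{vmatrix} = \begin{vmatrix} 3 & -1 & -1 \\ 1 & 1 & 0 \\ 0 & -1 & 3 \end{vmatrix}$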


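b) $\begin{vmatrix} 1 & -1 & 4 \\ 1 & -1 & 1 \\ 2 & 3 & 4 \end{vmatrix} + \begin{vmatrix} -1 & 1 & -3 \\ 1 & -1 & 1 \\ 2 & 3 & 4 \end{vmatrix} = \begin{vmatrix} 0 & 0 & 1 \\ 1 & -1 & 1 \\ 2 & 3 & 4 \end{vmatrix}$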

6. Special matrices. There are certain square matrices which have special
forms but which occur so frequently in the theory of matrices that they
have been given special names. The most general of these is the triangular
matrix, that is, a square matrix having the property that either all its
elements to the right or all to the left of its diagonal are zero. Thus a square
matrix $A = (a_{ij})$ is triangular if it is true that either $a_{ij} = 0$ for all $j > i$ or
that $a_{ij} = 0$ for all $j < i$. It is clear that A is triangular if and only if A' is
triangular; and, moreover, we have

Theorem 1. The determinant of a triangular matrix is the product
$a_{11} a_{22} \cdots a_{nn}$ of its diagonal elements.

The result above is clearly true if n = 1 so that $A = (a_{11})$, $|A| = a_{11}$.
We assume it true for square matrices of order n - 1 and complete our
induction by expanding |A| according to the elements of its first row or
first column in the respective cases above.

A matrix $A = (a_{ij})$ is called a diagonal matrix if it is a square matrix
and $a_{ij} = 0$ for all $i \neq j$. Clearly, a diagonal matrix A is triangular so
that its determinant is the product of its diagonal elements.

If all the diagonal elements of a diagonal matrix are equal, we call the
matrix a scalar matrix and have $|A| = a_{11}^{n}$. The scalar matrix for which
$a_{11} = 1$ is called the n-rowed identity matrix and will usually be designated
by I. If there be some question as to the order of I or if we are discussing
several identity matrices of different orders, we shall indicate the order by
a subscript and thus shall write either $I_n$ or I as is convenient for the
n-rowed identity matrix.

Any scalar matrix may be indicated by

(23) $aI ,$

where $a = a_{11}$ is the common value of the diagonal elements of the matrix.
We shall discuss the implications of this notation later.

It is natural to call any m by n matrix all of whose elements are zeros 
a zero matrix. In any discussion of matrices we shall use the notation 0 to 
represent not only the number zero but any zero matrix. The reader will 
find that this usage will cause neither difficulty nor confusion. 

We shall frequently feel it desirable to consider square matrices of either
of the forms

(24) $A = \begin{pmatrix} A_1 & A_2 \\ 0 & A_4 \end{pmatrix} , \quad A = \begin{pmatrix} A_1 & 0 \\ A_3 & A_4 \end{pmatrix} ,$

where $A_1$ is a square matrix. Then (24) implies that $A_4$ is necessarily square,
and the reader should verify the fact that the Laplace expansion of
determinants implies that

$|A| = |A_1| \cdot |A_4| .$

The property above and that of Theorem 1 are special instances of a
more general situation. We let A be a square matrix and partition it as in
(6) with s = t and the submatrices $A_{ii}$ all square matrices. Then the
Laplace expansion clearly implies that if all the $A_{ij}$ are zero matrices for either
all $i > j$ or all $i < j$, then $|A| = |A_{11}| \cdots |A_{tt}|$. Evidently Theorem 1
is the case where the $A_{ij}$ are one-rowed square matrices and the result
considered in (24) the case where t = 2.

In connection with the discussion just completed we shall define a
notation which is quite useful. Let A be a square matrix partitioned as in (6)
and suppose that s = t, the $A_{ii}$ are all square matrices, and every $A_{ij} = 0$
for $i \neq j$. Then A is composed of zero matrices and matrices $A_i = A_{ii}$
which are what we may call its diagonal blocks, and we shall indicate this
by writing

(25) $A = \operatorname{diag} \{A_1, \ldots, A_t\} .$

As above, the determinant of A is the product of the determinants of its
submatrices $A_1, \ldots, A_t$.

In closing we note the following result which we referred to at the close 
of Section 4. 

Lemma 9. Every nonzero m by n matrix A is rationally equivalent to a
matrix

(26) $\begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix} ,$

where I is an identity matrix.

For by elementary transformations of type 1 we may carry any element
$a_{pq} \neq 0$ of A into the element $b_{11}$ of a matrix B which is rationally equivalent
to A. By an elementary transformation of type 3 defined for $a = b_{11}^{-1}$ we
replace B by a rationally equivalent matrix C with $c_{11} = 1$. We then apply
elementary row transformations of type 2 with $c = -c_{r1}$ to replace C by
the rationally equivalent matrix $D = (d_{ij})$ such that $d_{11} = 1$, $d_{r1} = 0$ for
$r > 1$, and then use elementary column transformations of type 2 with
$c = -d_{1r}$ to replace D by the matrix

(27) $A_0 = \begin{pmatrix} I_1 & 0 \\ 0 & A_1 \end{pmatrix} .$

Now $A_1$ is an m - 1 by n - 1 matrix, and $I_1$ is the identity matrix of one
row. Clearly, A and $A_0$ are rationally equivalent. Moreover, either $A_1 = 0$
and we have (26) for $I = I_1$, or our proof shows that $A_1$ is rationally
equivalent to a matrix

(28) $\begin{pmatrix} I_1 & 0 \\ 0 & A_2 \end{pmatrix} .$

But then, by Lemma 1, A is rationally equivalent to a matrix

(29) $\begin{pmatrix} I_2 & 0 \\ 0 & A_2 \end{pmatrix} .$

After finitely many such steps we obtain (26).

We shall show later that the number of rows in the matrix I of (26) is 
uniquely determined by A. 
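The steps of the proof of Lemma 9 can be sketched as a short program. This is a minimal illustration under stated assumptions, not the author's procedure verbatim: the function name `normal_form` is hypothetical, entries are taken to be rational numbers, indices are 0-based, and the first nonzero element found is used as the element $a_{pq}$.

```python
from fractions import Fraction


def normal_form(a):
    """Carry a into the form (26) by elementary row and column transformations.

    Returns the reduced matrix and the number r of rows of its identity block.
    """
    m = [[Fraction(x) for x in row] for row in a]
    rows, cols = len(m), len(m[0])
    r = 0
    while r < rows and r < cols:
        # Look for a nonzero element in the part of the matrix not yet reduced.
        pivot = next(((p, q) for p in range(r, rows) for q in range(r, cols)
                      if m[p][q] != 0), None)
        if pivot is None:
            break
        p, q = pivot
        # Type 1: interchange rows and columns to bring the element to place (r, r).
        m[r], m[p] = m[p], m[r]
        for row in m:
            row[r], row[q] = row[q], row[r]
        # Type 3: multiply row r by the inverse of the element now at (r, r).
        inv = 1 / m[r][r]
        m[r] = [inv * x for x in m[r]]
        # Type 2 on rows: make every other element of column r zero.
        for i in range(rows):
            if i != r and m[i][r] != 0:
                c = -m[i][r]
                m[i] = [x + c * y for x, y in zip(m[i], m[r])]
        # Type 2 on columns: make every other element of row r zero.
        for j in range(cols):
            if j != r and m[r][j] != 0:
                c = -m[r][j]
                for row in m:
                    row[j] += c * row[r]
        r += 1
    return m, r


reduced, r = normal_form([[1, 1, 2], [2, 3, 5], [1, 2, 3]])  # sample matrix
print(r)
for row in reduced:
    print([str(x) for x in row])
```

Exact rational arithmetic is used so that the test for a nonzero element is reliable; floating-point arithmetic would need a tolerance.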


EXERCISES 

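1. Carry the following matrices into rationally equivalent matrices of the
form (26).

a) $\begin{pmatrix} 3 & 5 & 10 & 4 \\ 0 & 1 & 2 & 1 \\ -1 & -1 & -2 & 0 \end{pmatrix}$

b) $\begin{pmatrix} 1 & 1 & 2 \\ 2 & 3 & 5 \\ 1 & 2 & 3 \end{pmatrix}$

c) $\begin{pmatrix} 1 & 0 & 1 & 2 \\ 2 & 0 & 2 & 3 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 \end{pmatrix}$

d) $\begin{pmatrix} -1 & 1 & 1 & 1 \\ 3 & -3 & -3 & -3 \\ -2 & 2 & 2 & 2 \\ 0 & 0 & 1 & 0 \end{pmatrix}$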


2. Apply elementary row transformations only and carry each of the following 
matrices into a matrix of the form (26) 



3. Apply elementary column transformations only and carry each of the
following matrices into a matrix of the form (26).


a) 



/O -2 
1 5 
1 0 
\l 0 


1 0 \ 
-5 0\ 

-3 01 

-1 - 1 / 



1 0 3 -1\ 
-1 2 4 5/ 



0 10 2 
0 5 13 


) 


4. Show that if the determinant of a square matrix A is not zero then A can be
carried into the identity matrix by elementary row* transformations alone. Hint:
The property $|A| \neq 0$ is preserved by elementary transformations. Some element
in the first column of A must not be zero, and by row transformations we may
carry A into a matrix with ones on the diagonal and zeros below and then into I.

* Not every matrix may be carried into the form (26) by row transformations only,
e.g., take A = (1 1).


7. Rational equivalence of rectangular matrices. The largest order of any
nonvanishing minor of a rectangular matrix A is called the rank of A. The
result of (20) states that every (t + 1)-rowed minor of A is a sum of
numerical multiples of its t-rowed minors, and if the latter are all zero so are
the former. We thus clearly have

Lemma 10. Let all (r + 1)-rowed minors of A vanish. Then the rank of A
is at most r.

We may also state this result as 

Lemma 11. Let A have a nonzero r-rowed minor and let all (r + l)-rowed 
minors of A vanish. Then r is the rank of A. 

Note that we are assigning zero as the rank of the zero matrix and that 
the rank of any nonzero matrix is at least 1. 

The problem of computing the rank of a matrix A would seem from our
definition and lemmas to involve as a minimum requirement the computation
of at least one r-rowed minor of A and all (r + 1)-rowed minors. The
number of determinants to be computed would then normally be rather
large, and the computations themselves generally quite complicated.
However, the problem may be tremendously simplified by the application of
elementary transformations. We are thus led to study the effect of such
transformations on the rank of a matrix.

Let then $A_0$ result from the application of an elementary row transformation
of either type 1 or type 3 to A. By Lemmas 3 and 5 every t-rowed
minor of $A_0$ is the product by a nonzero scalar of a uniquely corresponding
t-rowed minor of A, and it follows that A and $A_0$ have the same rank. If
$A_0$ results when we add to the ith row of A the product by $c \neq 0$ of its qth
row and B is a t-rowed square submatrix of A, the correspondingly placed
submatrix $B_0$ of $A_0$ is equal to B if no row of B is a part of the ith row of A.
If, however, a row of B is in the ith row of A and a row of B is in the qth
row of A, then by Lemma 2.4 we have $|B_0| = |B|$. If, finally, a row of B
is in the ith row of A but no row is in the qth row of A, then by Lemma 8
$|B_0| = |B| + c|C|$, where C is a t-rowed square matrix all but one of
whose rows coincide with those of B, and this remaining row is obtained by
replacing the elements of B in the ith row of A by the correspondingly
columned elements in its qth row. But then it is easy to see that $\pm|C|$ is
a minor of A as well as of $A_0$. If A has rank r we put t = r + 1 and see
that |B| = |C| = 0, $|B_0| = 0$ for every (r + 1)-rowed minor $|B_0|$ of $A_0$.
Also there exists an r-rowed minor $|B| \neq 0$ in A, and our proof shows the
existence of a corresponding minor $|B_0| = |B|$ or $|B_0| = |B| + c|C|$ in
$A_0$. But then $|B_0| = 0$ implies that $|C| = -c^{-1}|B| \neq 0$, and $A_0$ has a
nonzero r-rowed minor $\pm|C|$, A and $A_0$ have the same rank.

We observe, finally, that if an elementary row transformation be applied
to the transpose A' of A to obtain $A_1$ and the corresponding column
transformation be applied to A to obtain $A_0$, then $A_0' = A_1$. By the above
proof A' and $A_1$ have the same rank, by Lemma 2 A and A' have the same
minors and hence the same rank, so that $A_1$ and $A_1' = A_0$ have the same
rank. Hence, A and $A_0$ have the same rank. Thus, any two rationally
equivalent matrices have the same rank.

Conversely, let the rank r of two m by n matrices A and B be the same
and use Lemma 9 to carry A into a rationally equivalent matrix (26). It
is clear that the rank of the matrix in (26) is the number of rows in the
matrix I. By the above proof A and this matrix have the same rank, $I = I_r$,
A is rationally equivalent to

(30) $\begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} .$

Similarly, B is rationally equivalent to (30) and to A. We have thus proved
what we regard as the principal result of this chapter.

Theorem 2. Two m by n matrices are rationally equivalent if and only if 
they have the same rank. 
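The saving promised by elementary transformations can be seen by computing the rank in both ways. The sketch below is illustrative only: the function names are assumptions, the first function follows the definition of rank by minors, and the second uses row transformations of types 1 and 2 alone, which by the argument above do not change the rank.

```python
from fractions import Fraction
from itertools import combinations


def det(a):
    """Determinant by expansion along the first row."""
    if not a:
        return 1
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))


def rank_by_minors(a):
    """Largest order t of a nonvanishing t-rowed minor (the definition of rank)."""
    rows, cols = len(a), len(a[0])
    for t in range(min(rows, cols), 0, -1):
        for rs in combinations(range(rows), t):
            for cs in combinations(range(cols), t):
                if det([[a[i][j] for j in cs] for i in rs]) != 0:
                    return t
    return 0


def rank_by_elimination(a):
    """Count the nonzero rows left after elementary row transformations."""
    m = [[Fraction(x) for x in row] for row in a]
    rows, cols = len(m), len(m[0])
    r = 0
    for j in range(cols):
        p = next((i for i in range(r, rows) if m[i][j] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]                      # type 1
        for i in range(r + 1, rows):
            c = -m[i][j] / m[r][j]
            m[i] = [x + c * y for x, y in zip(m[i], m[r])]  # type 2
        r += 1
    return r


A = [[1, 0, 1, 2], [2, 0, 2, 3], [0, 0, 0, 0], [1, 0, 1, 1]]  # sample matrix
print(rank_by_minors(A), rank_by_elimination(A))
```

For an m by n matrix the first function may examine very many determinants, while the second needs only a number of arithmetic operations of the order of the product of m, n, and the smaller of m and n.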

We have also the consequent 

Corollary. Every m by n matrix of rank r is rationally equivalent to an 
m by n matrix (30). 

A matrix is called nonsingular if it is a square matrix and its determinant 
is not zero. But then Theorem 2 implies, as in Ex. 4 of Section 6, 

Theorem 3. Every n-rowed nonsingular matrix is rationally equivalent to 
the n-rowed identity matrix. 

In closing let us observe a result of the application to a matrix of either 
row transformations only or column transformations only. We shall prove 

Theorem 4. Every m by n matrix of rank $r > 0$ may be carried into a
matrix of the forms

(31) $\begin{pmatrix} G \\ 0 \end{pmatrix} , \quad (H \quad 0) ,$

respectively, by a sequence of elementary row or column transformations only,
where G is an r-rowed matrix, H is an r-columned matrix, and both G and H
have rank r.

For A is equivalent to (30) by a sequence of elementary row and column
transformations. Clearly, we may obtain (30) by first applying all the row
transformations and then all the column transformations. If we then apply
the inverses of the column transformations in reverse order to (30), we
obtain the result of the application of the row transformations alone to A.
But column transformations applied to (30) clearly carry this matrix into
a matrix of the form given by the first matrix of (31). Moreover, it is
evident that the rank of this matrix is that of G, G has rank r. The result for
column transformations is obtained similarly.


EXERCISES 

1. Compute the rank r of the following matrices by using elementary
transformations to carry each into a matrix with all but r rows (or columns) of
zeros and an obvious r-rowed nonzero minor.


a) 


c) 


1 1 3\ 

15 2 


(2 - 1 
6) 1 3 


5 -1/ 

1 

\i 

-11 


14/ 



/6 

4 

3 

-84 

- 3 


(1 

2 

3 

-48 

-12 


d) 1 

-2 

1 

-12 

- 9 

12/ 

\4 

4 - 

1 

-24 


2. Carry the first of each of the following pairs of matrices into the second by 
elementary transformations. Hint: If necessary carry A and B into the form (30) 
and then apply the inverses of those transformations which carry B into (30) to 
carry (30) into B. 


o) ^ = (j 



-2 1 \ /2 0 - 1 \ 

-2 2), R = (6 0 3] 

-4 1/ \1 0 1/ 


-2 1\ /I 1 0\ 

-42), R == (1 1 0) 

-6 3 / Vo 0 0/ 



CHAPTER III 

EQUIVALENCE OF MATRICES AND OF FORMS 

1. Multiplication of matrices. If $x_1, \ldots, x_m$ and $y_1, \ldots, y_n$ are variables
related by a system of linear equations

(1) $x_i = \sum_{j=1}^{n} a_{ij} y_j \qquad (i = 1, \ldots, m) ,$

this system was said in Section 1.9 to define a linear mapping carrying the
$x_i$ to the $y_j$. We call the m by n matrix $A = (a_{ij})$ the matrix of the
mapping (1).

Suppose now that $z_1, \ldots, z_q$ are variables related to $y_1, \ldots, y_n$ by a
second linear mapping

(2) $y_j = \sum_{k=1}^{q} b_{jk} z_k \qquad (j = 1, \ldots, n) ,$

with n by q matrix $B = (b_{jk})$, carrying the $y_j$ to the $z_k$. Then, if we
substitute (2) in (1) we obtain a third linear mapping

(3) $x_i = \sum_{k=1}^{q} c_{ik} z_k \qquad (i = 1, \ldots, m) ,$

with m by q matrix $C = (c_{ik})$, and it is easily verified by substitution that

(4) $c_{ik} = \sum_{j=1}^{n} a_{ij} b_{jk} \qquad (i = 1, \ldots, m;\ k = 1, \ldots, q) .$

The linear mapping (3) is usually called the product of the mappings (1)
and (2), and we shall also write C = AB and call the matrix C the product
of the matrix A by the matrix B.

We have now defined the product AB of an m by n matrix A and an
n by q matrix B to be a certain m by q matrix C. Moreover, we have
defined C so that the element $c_{ik}$ in its ith row and kth column is obtained
as the sum $c_{ik} = a_{i1} b_{1k} + a_{i2} b_{2k} + \ldots + a_{in} b_{nk}$ of the products of the
elements $a_{ij}$ in the ith row

(5) $(a_{i1}, a_{i2}, \ldots, a_{in})$

of A, by the corresponding elements $b_{jk}$ in the kth column

(6) $\begin{pmatrix} b_{1k} \\ b_{2k} \\ \vdots \\ b_{nk} \end{pmatrix}$

of B. Thus we have stated what we shall speak of as the row by column rule
for multiplying matrices.
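The row by column rule and its origin in the substitution of (2) in (1) may be illustrated as follows. This is a minimal sketch; the names `matmul` and `apply` and the sample matrices are assumptions made for the example, and indices are 0-based.

```python
def matmul(a, b):
    """Row by column rule (4): c_ik is the sum over j of a_ij * b_jk."""
    n = len(b)
    if any(len(row) != n for row in a):
        raise ValueError("the number of columns of A must equal the number of rows of B")
    return [[sum(a[i][j] * b[j][k] for j in range(n)) for k in range(len(b[0]))]
            for i in range(len(a))]


def apply(a, v):
    """Values of the left members of a linear mapping with matrix a at the values v."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


A = [[2, 1], [3, -1], [1, 2]]     # the x's in terms of the y's (3 by 2), sample data
B = [[1, -1, 1], [-2, 1, -3]]     # the y's in terms of the z's (2 by 3), sample data
C = matmul(A, B)                  # the x's in terms of the z's (3 by 3)

z = [1, 2, 3]
via_y = apply(A, apply(B, z))     # substitute (2) in (1)
direct = apply(C, z)              # use (3) directly
print(C, via_y, direct)
```

The two ways of computing the x's agree for every choice of the z's, which is the content of (3) and (4).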

Observe that if either A or B is a zero matrix the product AB is also.
Moreover, if A is an m by n matrix, then we have

(7) $I_m A = A I_n = A ,$

where $I_r$ represents the r-rowed identity matrix defined for every r as in
Section 2.6. Observe also that $I_m$ is the matrix of the linear transformation
$x_i = y_i$ for $i = 1, \ldots, m$, and thus in this case the product of (1)
by (2) is immediately (2); hence (7) is trivially true.

We have not defined and shall not define the product AB of two matrices
in which the number of columns in A is not the same as the number of rows
in B. Then, evidently, the fact that AB is defined need not imply that BA
is defined. But when both are defined, they are generally not equal and
may not even be matrices of the same size. This latter fact is clearly so if,
for example, A is m by n, B is n by m, and $m \neq n$.

If A and B are n-rowed square matrices, we shall say that A and B are
commutative if AB = BA. Note the examples of noncommutative square
matrices in the exercises below.

Finally, let us observe the following 

Theorem 1. The transpose of a product of two matrices is the product of 
their transposes in reverse order. 

In symbols we state this result as 

(8) (AB)' = B'A'. 


Here A is an m by n matrix, B is an n by q matrix, (AB)' is a q by m
matrix, which we state is the product of the q by n matrix B' and the n by m
matrix A'. We leave the direct application of the row by column rule to
prove this result as an exercise for the reader.


EXERCISES 

1. Compute (3) and hence the product C = AB for the following linear changes
of variables (mappings) and afterward compute C by the row by column rule.

a) $x_1 = 2y_1 + y_2 , \quad x_2 = 3y_1 - y_2 , \quad x_3 = y_1 + 2y_2 ;$
   $y_1 = z_1 - z_2 +$
   $y_2 = -2z_1 + z_2 - 3z_3$

b) $x_1 = 2y_1 + 3y_2 - y_3 , \quad x_2 = y_1 + y_2 + 4y_3 , \quad x_3 = y_2 - 9y_3 ;$
   $y_1 = -13z_1 + 26z_2 + 13z_3 , \quad y_2 = 9z_1 - 18z_2 - 9z_3 , \quad y_3 = z_1 - 2z_2 - z_3$


2. Compute the following matrix products AB. Compute also BA in the cases 
where the latter product is defined. 


/4 3 2\ /-I -2 -3 

a) 3 2 11 1 3 0 

\2 1 -1/ \-l -2 4, 


c) 




• eisH; 


d) 


-1 3 4) 


e) " (-1 2 3 2) 


3. Let the symbol $E_{ij}$ represent the three-rowed square matrix with unity in the
ith row and jth column and zeros elsewhere. Verify by explicit computation that
$E_{ij} E_{jk} = E_{ik}$ and that if $j \neq g$ then $E_{ij} E_{gk} = 0$.


2. The associative law. It is important to know that matrix multiplication
has the property

(9) $(AB)C = A(BC)$

for every m by n matrix A, n by q matrix B, q by s matrix C. This result
is known as the associative law for matrix multiplication and it may be
shown as a consequence that no matter how we group the factors in forming
a product $A_1 \cdots A_t$ the result is the same. In particular, the powers $A^e$ of
any square matrix are unique. We shall assume these two consequences of
(9) without further proof and refer the reader to treatises on the
foundations of mathematics for general discussions of such questions.

To prove (9) we write $A = (a_{ij})$, $B = (b_{jk})$, $C = (c_{kl})$, where in all
cases $i = 1, \ldots, m$; $j = 1, \ldots, n$; $k = 1, \ldots, q$; $l = 1, \ldots, s$. Then it
is clear that the element in the ith row and lth column of G = (AB)C is

(10) $g_{il} = \sum_{k=1}^{q} \Bigl( \sum_{j=1}^{n} a_{ij} b_{jk} \Bigr) c_{kl} ,$

while that in the same position in H = A(BC) is

(11) $h_{il} = \sum_{j=1}^{n} a_{ij} \Bigl( \sum_{k=1}^{q} b_{jk} c_{kl} \Bigr) .$

Each of these expressions is a sum of nq terms which are respectively of the
form $(a_{ij} b_{jk}) c_{kl}$ and $a_{ij} (b_{jk} c_{kl})$. But those terms in the respective sums with
the same sets of subscripts are equal, since we have already assumed that
the elements of our matrices satisfy the associative law a(bc) = (ab)c.
Hence, $g_{il} = h_{il}$ for all i and l, G = H, and (9) is proved.
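The two double sums (10) and (11) may be compared term by term on an example. The sketch below is illustrative only; the matrices B and C are arbitrary sample data and the names `g` and `h` merely echo the notation of the proof, with 0-based indices.

```python
def matmul(a, b):
    """Row by column rule for the product of a and b."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b))) for k in range(len(b[0]))]
            for i in range(len(a))]


A = [[2, -1, 3], [3, 1, 2], [0, -1, 1]]   # m by n with m = n = 3
B = [[1, 2], [0, -1], [3, 1]]             # n by q with q = 2 (sample data)
C = [[1, -1, 2, 0], [2, 0, 1, 3]]         # q by s with s = 4 (sample data)
m, n, q, s = 3, 3, 2, 4


def g(i, l):
    """Element of (AB)C as in (10): the inner sum runs over j, the outer over k."""
    return sum(sum(A[i][j] * B[j][k] for j in range(n)) * C[k][l] for k in range(q))


def h(i, l):
    """Element of A(BC) as in (11): the inner sum runs over k, the outer over j."""
    return sum(A[i][j] * sum(B[j][k] * C[k][l] for k in range(q)) for j in range(n))


G = [[g(i, l) for l in range(s)] for i in range(m)]
H = [[h(i, l) for l in range(s)] for i in range(m)]
print(G == H)
```

The comparison holds because each sum consists of the same nq products, only grouped differently.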

EXERCISES 

Compute the products {AB)C and A{BC) in the following cases. 

/2-1 3\ 
a) A = 3 12, 

Vo -1 1/ 


c) A = (1 2 -1 3), 


3. Products by diagonal and scalar matrices. Let $A = (a_{ij})$ be an m by n
matrix and B be a diagonal matrix, and designate the ith diagonal element
of B by $b_i$. Then our definition of product implies that if B is m-rowed so
that BA is defined, then

(12) $BA = (b_i a_{ij}) \qquad (i = 1, \ldots, m;\ j = 1, \ldots, n) ;$


'’■(-! i)’ ° 

\l 3/ 
while if B is n-rowed, then

(13) $AB = (a_{ij} b_j) \qquad (i = 1, \ldots, m;\ j = 1, \ldots, n) .$

Thus the product of a matrix A by a diagonal matrix B on the left is
obtained as the result of multiplying the rows of A in turn by the corresponding
diagonal elements of B; the product of A by a diagonal matrix on the
right is the result of multiplying the columns of A in turn by the
corresponding diagonal elements of B.
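The rules (12) and (13) are easily checked by machine. In the sketch below the helper `diag` and the sample matrices are assumptions made for illustration; indices are 0-based.

```python
def matmul(a, b):
    """Row by column rule for the product of a and b."""
    return [[sum(a[i][j] * b[j][k] for j in range(len(b))) for k in range(len(b[0]))]
            for i in range(len(a))]


def diag(entries):
    """Diagonal matrix with the given diagonal elements."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]


A = [[2, 4, -5], [-4, 0, 1], [-1, -3, -2]]   # sample matrix
b = [1, -2, 3]                               # diagonal elements of B
B = diag(b)

BA = matmul(B, A)   # rows of A multiplied by b_1, b_2, b_3 as in (12)
AB = matmul(A, B)   # columns of A multiplied by b_1, b_2, b_3 as in (13)
print(BA)
print(AB)
```

Since the diagonal elements chosen here are distinct and A is not diagonal, the two products differ.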

Now, let m = n, A be a square matrix. Then from (12) and (13) we see
that AB = BA if and only if

(14) $(b_i - b_j) a_{ij} = 0 \qquad (i, j = 1, \ldots, n) .$

As an immediate consequence of the case $b_1 = \ldots = b_n$ we have the very
simple

Theorem 2. Every n-rowed scalar matrix is commutative with all n-rowed
square matrices.

We next see that, if $i \neq j$ and $b_i \neq b_j$, then (14) implies that $a_{ij} = 0$.
This gives the result we shall state as

Theorem 3. Let the diagonal elements of an n-rowed diagonal matrix B be
all distinct. Then the only n-rowed square matrices commutative with B are
the n-rowed diagonal matrices.

We may now prove the converse of Theorem 2 —a result which is the 
inspiration of the name scalar matrix. 

Theorem 4. The only n-rowed square matrices which are commutative with 
every n-rowed square matrix are the scalar matrices. 

For let

(15) $B = (b_{ij}) \qquad (i, j = 1, \ldots, n)$

and suppose that B is commutative with every n-rowed square matrix A.
We shall select A in various ways to obtain our theorem. First, we let $E_j$ be
the diagonal matrix with unity in its jth row and column and zeros elsewhere
and put $BE_j = E_jB$. Equations (12) and (13) imply that the jth
row of $E_jB$ is the same as that of B and the jth column of $BE_j$ is the
same as that of B, while all other columns of $BE_j$ are zero. Thus if $i \neq j$,
the elements in the ith column of $E_jB$ must be zero. Since $b_{ji}$ is in the
ith column, we have $b_{ji} = 0$ for $j \neq i$, and B is a diagonal matrix. If $D_j$
is the matrix with 1 in its first row and jth column and zeros elsewhere, the
product $BD_j$ has $b_{11}$ in its first row and jth column and is equal to the
matrix $D_jB$ which has $b_{jj}$ in this same place. Hence, $b_{jj} = b_{11} = b$ for
$j = 2, \ldots, n$, and B is the scalar matrix $bI_n$.

Let us now observe that if A is any m by n matrix and a is any scalar,
then

(16) $(aI_m)A = A(aI_n)$

is an m by n matrix whose element in the ith row and jth column is the
product by a of the corresponding element of A. This is then a type of
product

(17) $aA = Aa$

like that defined in Chapter I for sequences, and we shall call such a product
the scalar product of a by A. However, we have defined (17) as the
instances (16) of our matrix product (4).


EXERCISES 

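1. Compute the products AB and BA by the use of (12) and (13) as well as the
row by column rule if

a) $A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 3 \end{pmatrix} , \quad B = \begin{pmatrix} 2 & 4 & -5 \\ -4 & 0 & 1 \\ -1 & -3 & -2 \end{pmatrix}$

b) $A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix} , \quad B = \begin{pmatrix} -1 & 0 & -2 \\ 2 & 1 & 4 \\ -2 & 3 & -3 \end{pmatrix}$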


0 

0 




0 

-1 


/ 0 

2 

0 

o' 

!. B = \ 

'0 

0 

0 

-11 

(o 

0 

-2 


ii 

0 

0 

0 

\o 

0 

0 

- 2 I 


lo 

1 

.0 

0 / 


d) $A = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix} , \quad B = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 2 & 0 & 0 \end{pmatrix}$


2. Find all three-rowed square matrices B such that BA = AB if

a) $A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 2 \end{pmatrix}$

b) $A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}$


42 INTRODUCTION TO ALGEBRAIC THEORIES 
3. Prove by direct multiplication that BA = -AB if $i^2 = -1$ and



4. Let $\omega$ be a primitive cube root of unity. Prove that $BA = \omega AB$ if

$A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \omega & 0 \\ 0 & 0 & \omega^2 \end{pmatrix} , \quad B = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ a & 0 & 0 \end{pmatrix} .$

5. Show that the matrix B of Ex. 4 has the property $B^3 = aI$ and that the
matrix A has the property $A^3 = I$. Obtain similar results for the matrices of
Ex. 3.

6. Compute BAB if 



where c and d are any ordinary complex numbers, $\bar{c}$ and $\bar{d}$ are their conjugates.

4. Elementary transformation matrices. We shall show that the matrix
which is the result of the application of any elementary row (column)
transformation to an m by n matrix A is a corresponding product EA (the
product AE) where E is a uniquely determined square matrix. We might of
course give a formula for each E and verify the statement, but it is simpler
to describe E and to obtain our results by a device which is a consequence
of the following

Theorem 5. Let A be an m by n matrix, B be an n by q matrix so that
C = AB is an m by q matrix. Apply an elementary row (column) transformation
to A (to B) resulting in what we shall designate by $A_0$ (by $B^{(0)}$), and then
the same elementary transformation to C resulting in $C_0$ (in $C^{(0)}$). Then

(18) $C_0 = A_0 B , \quad C^{(0)} = AB^{(0)} .$

For proof we see that if we replace $a_{ij}$ by $a_{rj}$ in the right members of (4)
we get $c_{rk}$. Thus we obtain the result of an elementary row transformation
of type 1 on AB by applying it to A. Our definitions also imply that for
elementary row transformations of type 3 the result stated as (18) follows
from

(19) $\sum_{j=1}^{n} (a\,a_{ij}) b_{jk} = a \sum_{j=1}^{n} a_{ij} b_{jk} = a\,c_{ik} \qquad (i = 1, \ldots, m;\ k = 1, \ldots, q) .$

Finally, we see that for type 2 they follow from

(20) $\sum_{j=1}^{n} (a_{ij} + c\,a_{rj}) b_{jk} = \sum_{j=1}^{n} a_{ij} b_{jk} + c \sum_{j=1}^{n} a_{rj} b_{jk} = c_{ik} + c\,c_{rk} \qquad (i = 1, \ldots, m;\ k = 1, \ldots, q) ,$

and we have proved the first equation in (18). The corresponding column
result has an obvious parallel proof or may be thought of as being obtained
by the process of transposition. It is surely unnecessary to supply further
details.

We now write A = IA where I is the m-rowed identity matrix and apply
Theorem 5 to obtain $A_0 = I_0 A$, in which $I_0$ is the matrix called E above.
Then we see that to apply any elementary row transformation to A we
simply multiply A on the left by either $E_{ij}$, $P_{ij}(c)$, or $R_i(a)$. Here we define
$E_{ij}$ to be the matrix obtained by interchanging the ith and jth rows of
I, $P_{ij}(c)$ by adding c times the jth row of I to its ith row, $R_i(a)$ by
multiplying the ith row of I by $a \neq 0$. We shall call $E_{ij}$, $P_{ij}(c)$, and $R_i(a)$ elementary
transformation matrices of types 1, 2, and 3, respectively.

Observe that

(21) $E_{ij}' = E_{ji} = E_{ij} , \quad E_{ij} E_{ij} = I ,$

so that, if we now assume that $E_{ij}$ is n-rowed, the product $AE_{ij}$ is the result
of an elementary column transformation of type 1 on A. Similarly,

(22) $[P_{ij}(c)]' = P_{ji}(c) , \quad P_{ij}(-c) P_{ij}(c) = P_{ij}(c) P_{ij}(-c) = I ,$

and if $P_{ij}(c)$ is n-rowed, then $AP_{ji}(c)$ is the result of an elementary column
transformation of type 2 on A. Finally,

(23) $[R_i(a)]' = R_i(a) , \quad R_i(a^{-1}) R_i(a) = R_i(a) R_i(a^{-1}) = I ,$

and if $R_i(a)$ is n-rowed, then $AR_i(a)$ is the result of an elementary column
transformation of type 3 on A. Thus the elementary column transformations
give rise to exactly the same set of elementary transformation matrices
as were obtained from the row transformations.
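The three types of elementary transformation matrices and the relations (21), (22), (23) may be realized concretely. The following is a sketch under stated assumptions: the function names `E`, `P`, `R` mirror the notation of the text but take the order n as an explicit first argument, indices are 0-based, and the sample matrix is arbitrary.

```python
from fractions import Fraction


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][j] * b[j][k] for j in range(len(b))) for k in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def E(n, i, j):
    """Type 1: interchange rows i and j of the n-rowed identity matrix."""
    m = identity(n)
    m[i], m[j] = m[j], m[i]
    return m


def P(n, i, j, c):
    """Type 2: add c times row j of the identity matrix to its row i (i != j)."""
    m = identity(n)
    m[i][j] = c
    return m


def R(n, i, a):
    """Type 3: multiply row i of the identity matrix by the nonzero scalar a."""
    m = identity(n)
    m[i][i] = a
    return m


A = [[1, 0, 3, -1], [-1, 2, 4, 5], [0, 1, 0, 2]]   # a 3 by 4 sample matrix

# Row transformations: multiply on the left by three-rowed matrices.
swapped = matmul(E(3, 0, 2), A)               # rows 0 and 2 interchanged
added = matmul(P(3, 1, 0, 5), A)              # row 1 increased by 5 times row 0
scaled = matmul(R(3, 2, Fraction(1, 2)), A)   # row 2 multiplied by 1/2

# Column transformation: multiply on the right by a four-rowed matrix.
col_added = matmul(A, P(4, 3, 1, 2))          # column 1 increased by 2 times column 3

relations = {
    "(21)": transpose(E(3, 0, 2)) == E(3, 2, 0) == E(3, 0, 2)
            and matmul(E(3, 0, 2), E(3, 0, 2)) == identity(3),
    "(22)": transpose(P(3, 1, 0, 5)) == P(3, 0, 1, 5)
            and matmul(P(3, 1, 0, -5), P(3, 1, 0, 5)) == identity(3),
    "(23)": transpose(R(3, 2, 2)) == R(3, 2, 2)
            and matmul(R(3, 2, Fraction(1, 2)), R(3, 2, 2)) == identity(3),
}
print(relations)
```

Multiplication on the left acts on rows and multiplication on the right acts on columns, in agreement with Theorem 5.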

We shall now interpret the results of Section 2.7 in terms of elementary
transformation matrices. First of all, we may interpret Theorem 2.2 as the
following

Lemma 1. Let A and B be m by n matrices. Then there exist elementary
transformation matrices $P_1, \ldots, P_s$, $Q_1, \ldots, Q_t$ such that

$B = (P_1 \cdots P_s) A (Q_1 \cdots Q_t)$

if and only if A and B have the same rank.

Theorem 2.3 is the case of Lemma 1 where A is the n-rowed identity
matrix, and consequently $B = P_1 \cdots P_s Q_1 \cdots Q_t$. Thus we obtain

Lemma 2. Every nonsingular matrix is a product of elementary
transformation matrices.

We shall close the results with an important consequence of Theorem 2.4.

Theorem 6. The rank of a product of two matrices does not exceed the rank
of either factor.

For, by Theorem 2.4, if the rank of A is r, there exists a sequence of
elementary row transformations carrying A into an m by n matrix $A_0$ whose
bottom m - r rows are all zero. By Theorem 5, if we apply these
transformations to C = AB, we obtain a rationally equivalent matrix $C_0 = A_0 B$. Then
the bottom m - r rows of the m-rowed matrix $C_0$ are all zero, and the
rank of $C_0$ is the rank of C and is at most r, as desired. The corresponding
result on the relation between the ranks of C and B is obtained similarly.
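Theorem 6 may be illustrated numerically. The sketch below is not part of the original text: the function `rank` counts the nonzero rows left after elementary row transformations, and the sample matrices are chosen only so that the two factors have different ranks.

```python
from fractions import Fraction


def matmul(a, b):
    return [[sum(a[i][j] * b[j][k] for j in range(len(b))) for k in range(len(b[0]))]
            for i in range(len(a))]


def rank(a):
    """Rank computed by elementary row transformations of types 1 and 2."""
    m = [[Fraction(x) for x in row] for row in a]
    rows, cols = len(m), len(m[0])
    r = 0
    for j in range(cols):
        p = next((i for i in range(r, rows) if m[i][j] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, rows):
            c = -m[i][j] / m[r][j]
            m[i] = [x + c * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


A = [[1, 1, 2], [2, 3, 5], [1, 2, 3]]   # sample matrix of rank 2
B = [[1, 1, 0], [1, 1, 0], [0, 0, 0]]   # sample matrix of rank 1
C = matmul(A, B)
ranks = (rank(A), rank(B), rank(C))
print(ranks)
```

The rank of the product may of course be smaller than both ranks of the factors; the theorem asserts only the inequality.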

EXERCISES 

1. Express the following as products of elementary transformation matrices. 



2. Find the elementary transformation matrices corresponding to the elementary
row transformations used in Ex. 2 of Section 2.6 and carry the matrices of that
exercise into the form (2.30) by matrix multiplication.

5. The determinant of a product. In the theory of determinants it is
shown that the symbol for the product of two determinants may be
computed as the row by column product of the symbols for its factors. In our
present terminology this result may be stated as

Theorem 7. The determinant of a product of two square matrices is the
product of the determinants of the factors, that is,

(24) $|AB| = |A| \cdot |B| .$

The usual proof in determinant theory of the result is quite complicated,
and it is interesting to note that it is possible to derive the theorem as a
simple consequence of our theorems which were obtained independently
and which we should have wished to derive even had Theorem 7 been
assumed. We shall, therefore, give such a derivation. We thus let A and B
be n-rowed square matrices and see that Theorem 6 states that if |A| = 0
then |AB| = 0. Hence, let A be nonsingular so that, by Lemma 2,

$A = P_1 \cdots P_s ,$

where the $P_i$ are elementary transformation matrices. From our definitions
we see that, if E is an n-rowed elementary transformation matrix of
type 1, 2, or 3, then |E| = -1, 1, or a, respectively. Thus, if G is an
n-rowed square matrix and $G_0 = EG$, then Lemmas 2.3, 2.4, and 2.5 imply
that $|G_0| = -|G|$, $|G|$, $a|G|$, respectively, and hence $|G_0| = |E| \cdot |G|$.
It follows clearly that, if $E_1, \ldots, E_t$ are any elementary transformation
matrices, then $|E_t E_{t-1} \cdots E_1 G| = |E_t| \cdots |E_1| \cdot |G|$. We apply this
result first to A to obtain $|A| = |P_1| \cdots |P_s|$ and then to AB to obtain
$|AB| = |P_1 \cdots P_s B| = |P_1| \cdots |P_s| \cdot |B| = |A| \cdot |B|$ as desired.
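Both the key step of the proof, $|EG| = |E| \cdot |G|$ for an elementary transformation matrix E, and the conclusion (24) may be checked on examples. The sketch below is illustrative; the matrices and the particular elementary matrices are sample data.

```python
def minor(a, i, j):
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det(a):
    """Determinant by expansion along the first row."""
    if not a:
        return 1
    return sum((-1) ** k * a[0][k] * det(minor(a, 0, k)) for k in range(len(a)))


def matmul(a, b):
    return [[sum(a[i][j] * b[j][k] for j in range(len(b))) for k in range(len(b[0]))]
            for i in range(len(a))]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


A = [[3, 2, -1], [1, 2, 0], [0, -1, 3]]      # sample matrix
B = [[2, 4, -5], [-4, 0, 1], [-1, -3, -2]]   # sample matrix

E = identity(3)
E[0], E[2] = E[2], E[0]   # type 1: rows 0 and 2 of I interchanged
P = identity(3)
P[1][0] = 5               # type 2: 5 times row 0 of I added to row 1
R = identity(3)
R[2][2] = 7               # type 3: row 2 of I multiplied by a = 7

elementary_ok = all(det(matmul(X, B)) == det(X) * det(B) for X in (E, P, R))
product_ok = det(matmul(A, B)) == det(A) * det(B)
print(det(E), det(P), det(R), elementary_ok, product_ok)
```

The determinants of the three elementary matrices come out as -1, 1, and a, as stated in the proof.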

6. Nonsingular matrices. An n-rowed square matrix A is said to have an
inverse if there exists a matrix B such that AB = BA = I is the n-rowed
identity matrix. Clearly, B is an n-rowed square matrix which we shall
designate by $A^{-1}$ and thus write

(25) $AA^{-1} = A^{-1}A = I .$

Moreover, we have

Theorem 8. A square matrix A has an inverse if and only if A is
nonsingular.

For if (25) holds, we apply Theorem 7 to obtain $|A| \cdot |A^{-1}| = |I| = 1$,
$|A| \neq 0$. The converse may be shown to follow from (21), (22), (23), and
Lemma 2; but we shall prove instead the result that if $|A| \neq 0$ then $A^{-1}$ is
uniquely determined as the matrix

(26) $A^{-1} = |A|^{-1} \cdot \operatorname{adj} A .$

This formula follows from the matrix equation

(27) $A(\operatorname{adj} A) = (\operatorname{adj} A)A = |A| \cdot I$

by multiplication by the scalar $|A|^{-1}$, and we observe that (27) is the
interpretation of (2.20) and (2.21) in terms of matrix product, where adj A is
our symbol for the adjoint matrix defined in (2.22).

We now prove $A^{-1}$ unique by showing that if either AB = I or BA = I
then B is necessarily the matrix $A^{-1}$ of (26). This is clear since in either
case $|A| \neq 0$, $A^{-1}$ of (26) exists, $A^{-1} = A^{-1}I = A^{-1}(AB) = (A^{-1}A)B =
IB = B$, and similarly if BA = I.

We note also that, if A and B are nonsingular,

(28) $(AB)^{-1} = B^{-1}A^{-1} .$

For $(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AA^{-1} = I$. Finally, if A is
nonsingular we have

(29) $(A^{-1})' = (A')^{-1} .$

For $I' = I = (AA^{-1})' = (A^{-1})'A'$, $(A^{-1})'$ is the inverse of A'.
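Formula (26) gives a direct, if not efficient, way to compute an inverse, and (28) and (29) can then be checked on examples. The sketch below assumes matrices with integer elements so that exact fractions may be formed; the function names and sample matrices are illustrative only.

```python
from fractions import Fraction


def minor(a, i, j):
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det(a):
    """Determinant by expansion along the first row."""
    if not a:
        return 1
    return sum((-1) ** k * a[0][k] * det(minor(a, 0, k)) for k in range(len(a)))


def adjoint(a):
    """Entry (i, j) of adj A is the cofactor of the element in row j, column i of A."""
    n = len(a)
    return [[(-1) ** (i + j) * det(minor(a, j, i)) for j in range(n)] for i in range(n)]


def inverse(a):
    """Inverse by (26): the adjoint multiplied by the scalar 1/|A| (integer elements assumed)."""
    d = det(a)
    if d == 0:
        raise ValueError("a singular matrix has no inverse")
    return [[Fraction(c, d) for c in row] for row in adjoint(a)]


def matmul(a, b):
    return [[sum(a[i][j] * b[j][k] for j in range(len(b))) for k in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


A = [[3, 2, -1], [1, 2, 0], [0, -1, 3]]      # sample nonsingular matrix
B = [[2, 4, -5], [-4, 0, 1], [-1, -3, -2]]   # sample nonsingular matrix

checks = {
    "(25)": matmul(A, inverse(A)) == identity(3) == matmul(inverse(A), A),
    "(27)": matmul(A, adjoint(A)) == [[det(A) * x for x in row] for row in identity(3)],
    "(28)": inverse(matmul(A, B)) == matmul(inverse(B), inverse(A)),
    "(29)": transpose(inverse(A)) == inverse(transpose(A)),
}
print(checks)
```

For matrices of large order the adjoint requires many determinants, and reduction by elementary transformations is the more economical route.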

A linear mapping (1) with m = n has an inverse mapping (2) with
n = q if the product (3) is the identical mapping, that is, if C = AB is the
identity matrix. But this is then possible if and only if $|A| \neq 0$, that is,
(1) is what we called a nonsingular linear mapping in Section 1.9. We shall
use this concept later in studying the equivalence of forms.


EXERCISES 

1. Show that if A is an n-rowed square matrix of rank $n - 1 \neq 0$, then adj A
has rank 1. Hint: By (27) we have A(adj A) = 0, $PAQ(Q^{-1} \cdot \operatorname{adj} A) = 0$ for

$PAQ = \begin{pmatrix} I_{n-1} & 0 \\ 0 & 0 \end{pmatrix} .$

Then the first n - 1 rows of $Q^{-1} \cdot \operatorname{adj} A$ must be zero, adj A has rank 1.

2. Use the result above and (28) to prove that $\operatorname{adj}(\operatorname{adj} A) = |A|^{n-2} A$ if $n > 2$
and adj (adj A) = A if n = 2 and $|A| \neq 0$.

3. Use Ex. 1 and (27) to show that $|\operatorname{adj} A| = |A|^{n-1}$.

4. Compute the inverses of the following matrices by the use of (26). 


/h he - 


0 


/ ^ 

-2 


(1 c ) 

h) 1 

0 


c) 0 

3 

-2 


Vo 

1 

0 / 

V -2 

0 

3 / 
5. Let A be the matrix of (d) above and B be the matrix of (e). Compute
$C = (AB)^{-1}$ by (28) and verify that (AB)C = I by direct multiplication.

7. Equivalence of rectangular matrices. Our considerations thus far have 
been devised as relatively simple steps toward a goal which we may now 
attain. We first make the 

Definition. Two m by n matrices A and B are called equivalent if there
exist nonsingular matrices P and Q such that

(30) $PAQ = B .$

Observe that P and Q are necessarily square matrices of m and n rows, 
respectively. By Lemma 2 both P and Q are products of elementary
transformation matrices and therefore A and B are equivalent if and only if A and
B are rationally equivalent. The reader should notice that the definition of 
rational equivalence is to be regarded here as simply another form of the 
definition of equivalence given above and, while the previous definition is 
more useful for proofs, that above is the one which has always been given 
in previous expositions of matrix theory. We may now apply Lemma 1 and 
have the principal result of the present chapter. 

Theorem 9. Two m by n matrices are equivalent if and only if they have
the same rank.

We emphasize in closing that, if A and B are equivalent, the proof of 
Theorem 2.2 shows that the elements of P and Q in (30) may be taken to 
be rational functions, with rational coefficients, of the elements of A and B. 
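
Theorem 9 reduces the test for equivalence to a comparison of ranks. The following Python sketch is an added illustration under stated assumptions: it computes the rank by elementary row transformations in exact rational arithmetic, and the names `rank` and `equivalent` as well as the sample matrices are hypothetical.

```python
from fractions import Fraction

def rank(A):
    # Row-reduce a copy of A with exact rational arithmetic and count the pivots.
    M = [[Fraction(x) for x in row] for row in A]
    r = 0
    for col in range(len(M[0]) if M else 0):
        pivot = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][col] / M[r][col]
            M[i] = [x - factor * y for x, y in zip(M[i], M[r])]
        r += 1
    return r

def equivalent(A, B):
    # Theorem 9: two m by n matrices are equivalent exactly when their ranks agree.
    same_shape = len(A) == len(B) and len(A[0]) == len(B[0])
    return same_shape and rank(A) == rank(B)

# Hypothetical examples: A has a second row equal to twice its first row.
A = [[1, 3, 2], [2, 6, 4], [-1, -4, 1]]
B = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(rank(A), equivalent(A, B), equivalent(A, I3))
```

Only rational operations on the elements are used, in keeping with the closing remark above.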

EXERCISES 

1. Compute matrices P and Q for each of the matrices A of Ex. 1 of Section 2.6
such that PAQ has the form (2.30). Hint: If A is m by n, we may obtain P by
applying those elementary row transformations used in that exercise to $I_m$ and
similarly for Q. (The details of an instance of this method are given in the
illustrative example at the end of Section 8.)

2. Show that the product AB of any three-rowed square matrices A and B of
rank 2 is not zero. Hint: There exist matrices P and Q such that $A_0 = PAQ$ has the
form (2.30) for r = 2. Then, if AB = 0, we have $A_0B_0 = 0$ where $B_0 = Q^{-1}B$ has
the same rank as B and may be shown to have two rows with elements all zero.

3. Compute the ranks of A, B, AB for the following matrices. Hint: Carry A
into a simpler matrix $A_0 = PA$ by row transformations alone, B into $B_0 = BQ$ by
column transformations alone, and thus compute the rank of $A_0B_0 = P(AB)Q$
instead of that of AB.


o) A “ 


6) A == 



B 



B = 


( 


1 3 2\ 

2 6 4) 

-1 -4 1/ 


c) A = 


/3 -5 2 
( 1-21 
1 1 -1 0 
\5 -8 3 



( -4 10 2 
-2 6 2 
3-4 2 
-12 0 


8. Bilinear forms. As we indicated at the close of Chapter I, the problem 
of determining the conditions for the equivalence of two forms of the same 
restricted type is customarily modified by the imposition of corresponding 
restrictions on the linear mappings which are allowed. We now precede 
the introduction of those restrictions which are made for the case of
bilinear forms by the presentation of certain notations which will simplify our
discussion.

The one-rowed matrices 

(31) $x' = (x_1, \dots, x_m)$, $y' = (y_1, \dots, y_n)$

have one-columned matrices x and y as their respective transposes. We 
let A be the m by n matrix of the system of equations (1) and see that this 
system may be expressed as either of the matrix equations 

(32) $x = Ay$, $x' = y'A'$.

We have called (1) a nonsingular linear mapping if m = n and A is
nonsingular. But then the solution of (1) for $y_1, \dots, y_n$ as linear forms in
$x_1, \dots, x_n$ is the solution

(33) $y = A^{-1}x$, $y' = x'(A^{-1})'$

of (32) for y in terms of x (or y' in terms of x'). We shall again consider
m by n matrices A and variables $x_i$ and $y_j$ and shall now introduce the
notation

(34) $x = P'u$

for a nonsingular linear mapping carrying the $x_i$ to new variables $u_k$, for
i, k = 1, ..., m, so that the transpose P of P' is a nonsingular m-rowed
square matrix and $u' = (u_1, \dots, u_m)$. Similarly, we write

(35) $y = Qv$

for a nonsingular n-rowed square matrix Q and $v' = (v_1, \dots, v_n)$. We now
return to the study of bilinear forms.

A bilinear form $f = \sum x_i a_{ij} y_j$, for i = 1, ..., m and j = 1, ..., n, is a
scalar which may be regarded as a one-rowed square matrix and is then the
matrix product

(36) $f = x'Ay$.

Here x and y are given by (31), and we call the m by n matrix A the matrix
of f, its rank the rank of f. Also let $g = x'By$ be a bilinear form in $x_1, \dots, x_m$
and $y_1, \dots, y_n$ with m by n matrix B. Then we shall say that f and g are
equivalent if there exist nonsingular linear mappings (34) and (35) such
that the matrix of the form in $u_1, \dots, u_m$ and $v_1, \dots, v_n$ into which f is
carried by these mappings is B. But if (34) holds, then $x' = u'P$ and

$f = (u'P)A(Qv) = u'(PAQ)v$,

so that B = PAQ and A are equivalent. Thus, two bilinear forms f and g
are equivalent if and only if their matrices are equivalent. By Theorem 9 we
see that two bilinear forms are equivalent if and only if they have the same
rank. It follows also that every bilinear form of rank r is equivalent to
the form

(37) $x_1y_1 + \dots + x_ry_r$.

These results complete the study of the equivalence of bilinear forms.

ILLUSTRATIVE EXAMPLE

We shall find nonsingular linear mappings (34) and (35) which carry the form
$f = 2x_1y_1 - 3x_1y_2 + x_1y_3 - x_2y_1 + 5x_2y_3 - 6x_3y_1 + 3x_3y_2 + 19x_3y_3$
into a form of the type (37). The matrix of f is

$A = \begin{pmatrix} 2 & -3 & 1 \\ -1 & 0 & 5 \\ -6 & 3 & 19 \end{pmatrix}$.

We interchange the first and second rows of A, add twice the new first row to the
new second row, add -6 times the new first row to the third row, and obtain

$\begin{pmatrix} -1 & 0 & 5 \\ 0 & -3 & 11 \\ 0 & 3 & -11 \end{pmatrix}$,

which evidently has rank 2. We then add the second row to the third row, multiply
the first row by -1, the second row by $-\tfrac{1}{3}$, and obtain

$\begin{pmatrix} 1 & 0 & -5 \\ 0 & 1 & -\tfrac{11}{3} \\ 0 & 0 & 0 \end{pmatrix}$.

The matrix P is obtained by performing the transformations above on the
three-rowed identity matrix, and hence

$P = \begin{pmatrix} 0 & -1 & 0 \\ -\tfrac{1}{3} & -\tfrac{2}{3} & 0 \\ 1 & -4 & 1 \end{pmatrix}$.

We continue and carry PA into PAQ of the form (2.30) for r = 2 by adding five
times the first column and $\tfrac{11}{3}$ times the second column of PA to its third column.
Then

$Q = \begin{pmatrix} 1 & 0 & 5 \\ 0 & 1 & \tfrac{11}{3} \\ 0 & 0 & 1 \end{pmatrix}$.

The corresponding linear mappings (34) and (35) are given respectively by

$\begin{cases} x_1 = -\tfrac{1}{3}u_2 + u_3 , \\ x_2 = -u_1 - \tfrac{2}{3}u_2 - 4u_3 , \\ x_3 = u_3 , \end{cases} \qquad \begin{cases} y_1 = v_1 + 5v_3 , \\ y_2 = v_2 + \tfrac{11}{3}v_3 , \\ y_3 = v_3 . \end{cases}$

We verify by direct substitution that

$\begin{aligned}
f &= (-\tfrac{1}{3}u_2 + u_3)(2v_1 + 10v_3 - 3v_2 - 11v_3 + v_3) \\
&\quad + (u_1 + \tfrac{2}{3}u_2 + 4u_3)(v_1 + 5v_3 - 5v_3) \\
&\quad + u_3(-6v_1 - 30v_3 + 3v_2 + 11v_3 + 19v_3) \\
&= -\tfrac{2}{3}u_2v_1 + u_2v_2 + 2u_3v_1 - 3u_3v_2 + u_1v_1 + \tfrac{2}{3}u_2v_1 + 4u_3v_1 - 6u_3v_1 + 3u_3v_2 \\
&= u_1v_1 + u_2v_2 ,
\end{aligned}$

as desired.
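
The example may also be checked by matrix multiplication. In the Python sketch below, which is an added illustration, the matrices P and Q are those obtained by applying the stated row transformations to the three-rowed identity matrix and the stated column transformations to a second identity matrix; the helper name `matmul` is a hypothetical choice.

```python
from fractions import Fraction as Fr

def matmul(A, B):
    # Row-by-column product of two matrices given as lists of rows.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

# Matrix of the form f of the example.
A = [[2, -3, 1], [-1, 0, 5], [-6, 3, 19]]

# Row transformations of the example applied to the identity matrix.
P = [[0, -1, 0], [Fr(-1, 3), Fr(-2, 3), 0], [1, -4, 1]]

# Column transformations of the example applied to the identity matrix.
Q = [[1, 0, 5], [0, 1, Fr(11, 3)], [0, 0, 1]]

PA = matmul(P, A)
PAQ = matmul(PA, Q)
print(PAQ)
```

The product PAQ is the matrix of the form into which f is carried, so a result with ones in the first two diagonal places and zeros elsewhere confirms the substitution.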


EXERCISES 

1. Use the method above to find nonsingular linear mappings which carry the 
following bilinear forms into forms of the type (37). 
a) 2xiyi 4- ^X)y% — xiyi — 2x^2 

b) xiyi - xiys 4- xiys 4- 4x2^1 - 8x2^2 4- 2x2^8 - xsyi 4- 2x3^2 - 3x8^8

c) 2xiyi + 3x12/2 - X\yz + 6x22/1 + 2x22/2 -f xayz -f 3xa2/i - Xa2/2 -f 2xa2/8

d ) 2 xi 2 /i - Xi 2/2 + 3 xi 2/8 + X12/4 ~ 2x22/1 + 3 x 22/2 - 13 x 22/3 + 6x22/4 + X82/i 

— a;82/8 + 2x82/4 + 4x42/i - X42/2 4- 2:42/8 + 6x42/4 

2. Use the method above to find a nonsingular linear mapping on $x_1, x_2, x_3$ such
that it and the identity mapping on $y_1, y_2, y_3$ carry the following forms into forms
of the type (37).

a) - 3 xi 2 /i + 6 x 12/2 + X 22/1 — 2 x 22/2 

b ) 3xi2/i + Xi 2/2 + 4x22/1 + 2x22/2 

c) Xi 2 /i — 4 xi 2/2 + a:i 2/8 — X 22/1 + 2 : 22/2 + 2 x 32/2 — Xzijz 

d ) 3 xi 2/2 + 62:12/3 + X22/2 + 2 x 22/3 + XzVi — 2x32/2 -f 2:32/8 

e ) ~Xi 2/2 + 2:12/8 4- 2:12/4 4- 2:22/2 4- 2x22/8 - 3x22/4 4- 2:82/1 - 2x32/2 4- 3x32/3 

4- 4x82/4 4- 2x42/2 — X42/8 — 3x42/4 

3. Find a nonsingular linear mapping on $y_1, y_2, y_3$ such that it and the identical
mapping on $x_1, x_2, x_3$ carry the forms of Ex. 2 into forms of the type (37). Hint: The
matrices A of the forms of Ex. 2 are nonsingular so that the corresponding matrices
P have the property PA = I. Then IAQ = AQ = I has the unique solution
Q = P.


4. Use elementary row transformations as above to compute the inverses of the 
matrices of Section 6, Ex. 4. 


5. Use elementary column transformations to compute the inverses of the
following matrices.


9. Congruence of square matrices. There are some important problems
regarding special types of bilinear forms as well as the theory of equivalence
of quadratic forms which arise when we restrict the linear mappings we use.
We let m = n so that the matrices A of forms $f = x'Ay$ are square matrices.
Then (34) and (35) are called cogredient mappings if Q = P'. Then f and g
are clearly equivalent under cogredient mappings if and only if

(38) $B = PAP'$.

We shall call A and B congruent if there exists a nonsingular matrix P
satisfying (38).

We shall not study the complicated question as to the conditions that
two arbitrary square matrices be congruent but shall restrict our attention
to two special cases.

Before passing to this study we observe that Lemma 2 and Section 7
imply that A and B are congruent if and only if A may be carried into B by
a sequence of operations each of which consists of an elementary row
transformation followed by the corresponding column transformation. Thus
$P = P_s \cdots P_1$, where the $P_i$ are elementary transformation matrices,
$B = P_s \cdots P_1 A P_1' \cdots P_s' = P_s \cdots P_2 (P_1 A P_1') P_2' \cdots P_s'$, and so forth as
desired. We shall speak of such operations on a matrix A as cogredient
elementary transformations of the three types and shall use them in our study
of the congruence of matrices.
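
A single cogredient elementary transformation may be sketched as follows. The Python fragment is an added illustration: it assumes the transformation "add c times row j to row i" and compares the result of the row operation followed by the corresponding column operation with the product EAE', where E is the elementary matrix. The names and the sample symmetric matrix are hypothetical.

```python
def matmul(A, B):
    # Row-by-column product of two matrices given as lists of rows.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def add_row_multiple(n, i, j, c):
    # Elementary matrix which adds c times row j to row i (i != j).
    E = [[int(r == s) for s in range(n)] for r in range(n)]
    E[i][j] = c
    return E

A = [[0, 2, 1], [2, 0, 3], [1, 3, 4]]
E = add_row_multiple(3, 0, 1, 5)

# The same pair of operations carried out directly on a copy of A.
B = [row[:] for row in A]
B[0] = [x + 5 * y for x, y in zip(B[0], B[1])]   # row 0 += 5 * row 1
for row in B:
    row[0] += 5 * row[1]                          # column 0 += 5 * column 1

congruent = matmul(matmul(E, A), transpose(E))
print(B == congruent, B == transpose(B))
```

Since A was chosen symmetric, the congruent matrix is symmetric as well, which anticipates the first remark of the next section.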

10. Skew matrices and skew bilinear forms. A square matrix A is called
symmetric if A' = A, and skew if A' = -A. If B = PAP' is congruent to
A then B' = PA'P', B is symmetric if and only if A is symmetric, B is
skew if and only if A is skew.

If $A = (a_{ij})$ is a skew matrix, then $a_{ji} = -a_{ij}$ for all i and j. Hence
$a_{ii} = -a_{ii}$ and consequently every diagonal element of A is zero. We use
this result in the proof of

Theorem 10. Two n-rowed skew matrices are congruent if and only if they
have the same rank r. Moreover r is an even integer 2t, and every skew matrix is
thus congruent to a matrix

(39) $\begin{pmatrix} 0 & -I_t & 0 \\ I_t & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$.

For either A = 0 = PAP' for every P and our result is trivial, or some
$a_{ij} \neq 0$. We may interchange the ith row and first row, the jth and
second row and thus also the corresponding columns by cogredient elementary
transformations of type 1. We thus obtain a skew matrix $H = (h_{ij})$
congruent to A and with $a_{ij} = h_{12} \neq 0$, $h_{21} = -h_{12}$, $h_{11} = h_{22} = 0$. Multiply
the first row and column of H by $h_{21}^{-1}$ and obtain a skew matrix $C = (c_{ij})$
congruent to A and with $c_{12} = -1$, $c_{21} = 1$, $c_{11} = c_{22} = 0$. We now apply
a sequence of cogredient elementary transformations of type 2 where we
add $-c_{j1}$ times the second row of C to its jth row as well as $c_{j2}$ times the
first row of C to its jth row for j = 3, ..., n, and thus obtain the skew
matrix

(40) $A_0 = \begin{pmatrix} E & 0 \\ 0 & A_1 \end{pmatrix}$, $E = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$.

The matrix $A_0$ is congruent to A, $A_1$ is necessarily a skew matrix of n - 2
rows, and any cogredient elementary transformation on the last n - 2 rows
and columns of $A_0$ induces corresponding transformations on $A_1$, carrying
it to a congruent skew matrix $H_1$ and carrying $A_0$ to a congruent skew
matrix

$\begin{pmatrix} E & 0 \\ 0 & H_1 \end{pmatrix}$.

It follows that, after a finite number of such steps, we may replace A by
a congruent skew matrix

(41) $G = \mathrm{diag}\{E_1, \dots, E_t, 0, \dots, 0\}$

with each $E_k = E$. Clearly, G and A have rank 2t. If B also is a skew
matrix of rank 2t, then B is congruent to (41) and to A, both A and B are
congruent to (39).

If A is a skew matrix, the corresponding bilinear form x'Ay is a skew
bilinear form. Hence, two such forms are equivalent under cogredient
nonsingular linear mappings if and only if they have the same rank. Moreover,
if f is a skew bilinear form of rank 2t it is equivalent under cogredient
nonsingular linear mappings to
$(x_1y_2 - x_2y_1) + \dots + (x_{2t-1}y_{2t} - x_{2t}y_{2t-1})$.
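
The reduction used in the proof of Theorem 10 can be sketched as a procedure. The Python code below is an added illustration and not the author's algorithm verbatim: it assumes the two-rowed block E has rows (0, -1) and (1, 0), as the normalization of the proof suggests, and it records the product P of the elementary matrices so that PAP' can be checked. All function names and sample matrices are hypothetical.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def skew_canonical(A):
    """Return (P, G) with G = P A P' = diag{E, ..., E, 0, ..., 0} for skew A."""
    n = len(A)
    G = [[Fraction(x) for x in row] for row in A]
    P = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

    def swap(i, j):
        # Interchange rows i, j and then columns i, j.
        G[i], G[j] = G[j], G[i]
        P[i], P[j] = P[j], P[i]
        for row in G:
            row[i], row[j] = row[j], row[i]

    def scale(i, c):
        # Multiply row i and then column i by c.
        G[i] = [c * x for x in G[i]]
        P[i] = [c * x for x in P[i]]
        for row in G:
            row[i] *= c

    def add(i, j, c):
        # Add c times row j to row i, then c times column j to column i.
        G[i] = [x + c * y for x, y in zip(G[i], G[j])]
        P[i] = [x + c * y for x, y in zip(P[i], P[j])]
        for row in G:
            row[i] += c * row[j]

    k = 0
    while k + 1 < n:
        # Look for a nonzero element above the diagonal of the remaining block.
        found = next(((i, j) for i in range(k, n) for j in range(i + 1, n)
                      if G[i][j] != 0), None)
        if found is None:
            break
        i, j = found
        swap(k, i)
        swap(k + 1, j)
        scale(k, 1 / G[k + 1][k])          # now G[k][k+1] = -1 and G[k+1][k] = 1
        for m in range(k + 2, n):
            c1, c2 = G[m][k], G[m][k + 1]
            add(m, k + 1, -c1)             # clears G[m][k] and G[k][m]
            add(m, k, c2)                  # clears G[m][k+1] and G[k+1][m]
        k += 2
    return P, G

A3 = [[0, 1, 2], [-1, 0, 8], [-2, -8, 0]]
P3, G3 = skew_canonical(A3)
print(G3 == matmul(matmul(P3, A3), transpose(P3)))
```

A row interchange of the blocks carries the resulting matrix into the form (39), so the sketch stops at the block diagonal form.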

EXERCISES 

Use a method analogous to that of the exercises of Section 8 to find cogredient
linear mappings carrying the following skew forms to forms of the type above.

a) $x_1y_2 - x_2y_1 + 2x_1y_3 - 2x_3y_1 + 8x_2y_3 - 8x_3y_2$

b) $2x_2y_1 - 2x_1y_2 - x_1y_3 + x_3y_1 + 2x_2y_3 - 2x_3y_2$

c) $x_1y_2 - x_2y_1 + 3x_1y_3 - 3x_3y_1 - 4x_4y_1 + 4x_1y_4 + 4x_2y_3 - 4x_3y_2$
$+ 8x_2y_4 - 8x_4y_2 + 8x_3y_4 - 8x_4y_3$

11. Symmetric matrices and quadratic forms. The theory of symmetric
matrices is considerably more extensive and complicated than that of skew
matrices, and we shall obtain only some of its most elementary results. Our
principal conclusion may be stated as

Theorem 11. Every symmetric matrix A is congruent to a diagonal matrix 
of the same rank as A. 

We may evidently assume that $A = A' \neq 0$, and we shall prove first
that A is congruent to a symmetric matrix $H = (h_{ij})$ with some diagonal
element $h_{ii} \neq 0$. This is true for H = A if some diagonal element of A is
not zero. Otherwise there is some $a_{ij} \neq 0$, $a_{ij} = a_{ji}$, and $a_{ii} = a_{jj} = 0$. We
then obtain H as the result of adding the jth row of A to its ith row and
the jth column of A to its ith column, $h_{ii} = a_{ij} + a_{ji} = 2a_{ij} \neq 0$ as
desired. We now permute the rows and corresponding columns of H to
obtain a congruent symmetric matrix C with $c_{11} = h_{ii} \neq 0$. Then we add
$-c_{11}^{-1}c_{k1}$ times the first row of C to its kth row for k = 2, ..., n, follow with the
corresponding column transformations, and obtain a symmetric matrix

$\begin{pmatrix} c_{11} & 0 \\ 0 & A_1 \end{pmatrix}$

congruent to A. Clearly, $A_1$ is a symmetric matrix with n - 1 rows. As
in the proof of Theorem 10 we carry out a finite number of such steps, and
it is clear that we ultimately obtain a diagonal matrix. It is a matrix
equivalent to A and must have the same rank.
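
The steps of this proof can be sketched as a procedure returning both the diagonal matrix and the matrix P. The Python code below is an added illustration over the rational numbers; the function names and the sample matrices are hypothetical, and the order in which diagonal elements are chosen is one possible choice among many.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def diagonalize_symmetric(A):
    """Return (P, D) with D = P A P' diagonal, for a symmetric matrix A."""
    n = len(A)
    D = [[Fraction(x) for x in row] for row in A]
    P = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

    def swap(i, j):
        # Interchange rows i, j and then columns i, j.
        D[i], D[j] = D[j], D[i]
        P[i], P[j] = P[j], P[i]
        for row in D:
            row[i], row[j] = row[j], row[i]

    def add(i, j, c):
        # Add c times row j to row i, then c times column j to column i.
        D[i] = [x + c * y for x, y in zip(D[i], D[j])]
        P[i] = [x + c * y for x, y in zip(P[i], P[j])]
        for row in D:
            row[i] += c * row[j]

    for k in range(n):
        # Find a nonzero diagonal element in the remaining block, if any.
        d = next((i for i in range(k, n) if D[i][i] != 0), None)
        if d is None:
            off = next(((i, j) for i in range(k, n) for j in range(i + 1, n)
                        if D[i][j] != 0), None)
            if off is None:
                break                      # the remaining block is zero
            i, j = off
            add(i, j, 1)                   # new diagonal element is 2 * a_ij
            d = i
        swap(k, d)
        for m in range(k + 1, n):
            add(m, k, -D[m][k] / D[k][k])  # clears D[m][k] and D[k][m]
    return P, D

A = [[10, -3, 3], [-3, 1, -1], [3, -1, 1]]
P, D = diagonalize_symmetric(A)
print([D[i][i] for i in range(3)])
```

The number of nonzero diagonal elements of D is the rank of A, in agreement with Theorem 11.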

The result above may be applied to obtain a corresponding result on
symmetric bilinear forms of rank r, that is, bilinear forms $f = x'Ay$ defined
by a symmetric matrix A of rank r. Theorem 11 then states that f is
equivalent under cogredient transformations to a form

(43) $a_1x_1y_1 + \dots + a_rx_ry_r$.

The results of Theorem 11 may also be applied to quadratic forms
$f = f(x_1, \dots, x_n)$. As we saw in Section 1.7, f is the one-rowed square matrix
product

(44) $f = x'Ax = \sum_{i,j=1}^{n} x_i a_{ij} x_j$

for a symmetric matrix A. We call the uniquely determined symmetric
matrix A the matrix of f and its rank the rank of f. Now in Section 1.9 we
defined the equivalence of any quadratic form f and any second quadratic
form $g = y'By$. We may then use the notations developed above and see
that f and g are equivalent if and only if A and B are congruent. For if
our nonsingular linear mapping is represented by the matrix equation
$x = P'u$, then $x' = u'P$ is a consequence, and $f = u'(PAP')u$, f and g are
equivalent if and only if $B = PAP'$. Thus Theorem 11 states that every
quadratic form of rank r is equivalent to

(45) $a_1x_1^2 + \dots + a_rx_r^2$.

The form (45) is of course to be regarded as a form in $x_1, \dots, x_n$ with
matrix $\mathrm{diag}\{a_1, \dots, a_r, 0, \dots, 0\}$. However, it may be regarded as a
form in $x_1, \dots, x_r$ with nonsingular matrix $\mathrm{diag}\{a_1, \dots, a_r\}$. We shall
call a quadratic form $f = x'Ax$ in $x_1, \dots, x_n$ with nonsingular matrix A
a nonsingular form, and we have shown that every quadratic form of rank r in
n variables may be written as a nonsingular form in r variables whose matrix
is a diagonal matrix.


EXERCISES 

1. What is the symmetric matrix A of the following quadratic forms? 

a) 3x1 + X 1 X 2 + 2 xiXs — 3 x 2 X 4 + xj 

b) 2 xiX 2 — 8x3X4 + xl 

c) X? — 3 xiX2 + 2x1X4 — 3xJ 


2. Find a nonsingular matrix P with rational elements for each of the following
matrices A such that PAP' is a diagonal matrix. Determine P by writing
A = IAI' and by applying cogredient elementary transformations.


a) 





c) 




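d) $\begin{pmatrix} 10 & -3 & 3 \\ -3 & 1 & -1 \\ 3 & -1 & 1 \end{pmatrix}$

e) $\begin{pmatrix} 0 & 0 & 2 & 3 \\ 0 & 0 & -2 & -3 \\ 2 & -2 & 0 & -1 \\ 3 & -3 & -1 & -3 \end{pmatrix}$

f) $\begin{pmatrix} 7 & 6 & 0 & -1 \\ 6 & 15 & 0 & 1 \\ 0 & 0 & 2 & 3 \\ -1 & 1 & 3 & 5 \end{pmatrix}$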


g) 


/-3 

-4 

1 


/ ^ 

2 

1 

1 

-5 

0 

- 5] 

h)i 1 

2 

1 

1 

0 

1 

1 

1 

1 

\ 0 

-5 

1 

14/ 

\-l 

-1 

0 


-1 

-1 

0 

0 


3. Write the symmetric bilinear forms whose matrices are those of (a), (b), (c),
and (d) of Ex. 2 and use the cogredient linear mappings obtained from that exercise
to obtain equivalent forms with diagonal matrices.

4. Apply the process of Ex. 3 to (e), (f), (g), and (h) of Ex. 2 for quadratic forms.

5. Which of the matrices of Ex. 2 are congruent if we allow any complex
numbers as elements of P?

6. Show that the forms $f = x_1^2 + x_2^2$ and $g = x_1^2 - x_2^2$ are not equivalent under
linear mappings with real coefficients. Hint: Consider the possible signs of values
of f and of g.

12. Nonmodular fields. In our discussion of the congruence and the
equivalence of two matrices A and B the elements of the transformation
matrices P and Q have thus far always been rational functions, with rational
coefficients, of the elements of A and B. While we have mentioned this
fact before, it has not, until now, been necessary to emphasize it. But the
reader will observe that we have not, as yet, given conditions that two
symmetric matrices be congruent, and our reason is that it is not possible to do
so without some statement as to the nature of the quantities which we allow
as elements of the transformation matrices P. We shall thus introduce an
algebraic concept which is one of the most fundamental concepts of our
subject, the concept of a field.

A field of complex numbers is a set F of at least two distinct complex
numbers a, b, ..., such that a + b, ab, a - b, and a/c are in F for every
a, b, $c \neq 0$ in F. Examples of such fields are, then, the set of all real
numbers, the set of all complex numbers, and the set of all rational functions
with rational coefficients of any fixed complex number c.

If F is any field of complex numbers, the set K = F(x) of all rational
functions in x with coefficients in F is a mathematical system having
properties, with respect to rational operations, just like those of F. Now it is
true that even if one were interested only in the study of matrices whose
elements are ordinary complex numbers there would be a stage of this study
where one would be forced to consider also matrices whose elements are
rational functions of x. Thus we shall find it desirable to define the concept
of a field in such a general way as to include systems like the field K defined
above. We shall do so and shall assume henceforth that what we called
constants in Chapter I and scalars thereafter are elements of a fixed field F.

The fields we have already mentioned all contain the complex number 
unity and are closed with respect to rational operations. But it is clearly 
possible to obtain every rational number by the application to unity of a 
finite number of rational operations. Thus all our fields contain the field 
of all rational numbers and are what are called nonmodular fields. The fields 
called modular fields will be defined in Chapter VI. We now make the
following brief

Definition. A set of elements F is said to form a nonmodular field if F
contains the set of all rational numbers and is closed with respect to rational
operations such that the following properties hold:

I. $(a + b) + c = a + (b + c)$, $(ab)c = a(bc)$;

II. $a(b + c) = ab + ac$;

III. $a + b = b + a$, $ab = ba$

for every a, b, c of F.

The difference a - b is always defined in elementary algebra to be a
solution x in F of the equation

(46) $x + b = a$,

and the quotient a/b to be a solution y in F of the equation*

(47) $yb = a$.

Thus our hypothesis that F is closed with respect to rational operations
should be interpreted to mean that any two elements a and b of F determine
a unique sum a + b and a unique product ab in F such that (46) has a
solution in F and (47) has a solution in F if $b \neq 0$. In the author's Modern
Higher Algebra it is shown that the solutions of (46) and (47) are unique.
In fact, it may be concluded that the rational numbers 0 and 1 have the
properties

(48) $a + 0 = a \cdot 1 = a$, $a \cdot 0 = 0$,

for every a of F and that there exists a unique solution x = -a of x + a = 0
and a unique solution $y = b^{-1}$ of yb = 1 for $b \neq 0$. Then the solutions
of (46) and (47) are uniquely determined by $x = a + (-b)$, $y = ab^{-1}$.

We also see that the rational number -1 is defined so that (-1) + 1 = 0,
and thus $(-1 + 1)a = 0 \cdot a = 0$, whereas $(-1 + 1)a = -1 \cdot a + 1 \cdot a = -1 \cdot a + a$.
Hence $-a = -1 \cdot a$. It is also true that $-(-a) = a$, $(-a)(-b) = ab$ for
every a and b of a field F.


EXERCISES 


1. Let a, b, and c range over the set of all rational numbers and F consist of all
matrices of the following types. Prove that F is a field. (Use the definition of
addition of matrices in (52).)



«r- 





2. Show that if a, b, c, and d range over all rational numbers and $i^2 = -1$, the
set of all matrices of the following kind form a quasi-field which is not a field.

$\begin{pmatrix} a + bi & 3(c + di) \\ c - di & a - bi \end{pmatrix}$

* Note that in view of III the property II implies that $(b + c)a = ba + ca$, and the
existence of a solution of (46) is equivalent to that of $b + x = a$, of (47) to that of
$by = a$. But there are mathematical systems called quasi-fields in which the law ab = ba
does not hold, and for these systems the properties just mentioned must be made as
additional assumptions.

13. Summary of results. The theory completed thus far on polynomials
with constant coefficients and matrices with scalar elements may now be
clarified by restating our principal results in terms of the concept of a field.
We observe first that if f(x) and g(x) are nonzero polynomials in x with
coefficients in a field F then they have a greatest common divisor d(x) with
coefficients in F, and d(x) = a(x)f(x) + b(x)g(x) for polynomials a(x), b(x)
with coefficients in F.

We next assume that A and B are two m by n matrices with elements in
a field F and say that A and B are equivalent in F if there exist nonsingular
matrices P and Q with elements in F such that PAQ = B. Then A and B
are equivalent in F if and only if they have the same rank. Moreover,
correspondingly, two bilinear forms with coefficients in F are equivalent in F if
and only if they have the same rank. Since the rank of a matrix (and of a
corresponding bilinear form) is defined without reference to the nature of
the field F containing its elements, the particular F chosen to contain the
elements of the matrices is relatively unimportant for the theory.

If A and B are square matrices with elements in a field F, then we call A
and B congruent in F if there exists a nonsingular matrix P with elements
in F such that PAP' = B. Similarly, we say that the bilinear forms x'Ay
and x'By are equivalent in F under cogredient transformations* if A and B
are congruent in F. When A' = -A the matrix A is skew, and every
matrix B congruent in F to A is skew; two skew matrices with elements in F
are congruent in F if and only if they have the same rank. Hence, the
precise nature of F is again unimportant.

Let A = A' be a symmetric matrix with elements in F so that any
matrix B congruent in F to A also is a symmetric matrix with elements in F.
Then two corresponding quadratic forms x'Ax and x'Bx are equivalent in
F if and only if A and B are congruent in F. Moreover, we have shown
that every symmetric matrix of rank r and elements in F is congruent in F
to a diagonal matrix $\mathrm{diag}\{a_1, \dots, a_r, 0, \dots, 0\}$ with $a_i \neq 0$ in F and
that correspondingly every quadratic form x'Ax is equivalent in F to
$a_1x_1^2 + \dots + a_rx_r^2$.

The problem of finding necessary and sufficient conditions for two
quadratic forms with coefficients in a field F to be equivalent in F is one
involving the nature of F in a fundamental way, and no simple solution of this
problem exists for F an arbitrary field. In fact, we can obtain results only
after rather complete specialization of F, and these results may be seen to
vary as we change our assumptions on F.

* We leave to the reader the explicit formulation of the definitions of equivalence in
F of two forms, of two bilinear forms, and of two bilinear forms under cogredient linear
mappings, where all the forms considered have elements in F.

The simplest conditions are those given in

Theorem 12. Let F be a field with the property that for every a of F there
exists a quantity b in F such that $b^2 = a$. Then two symmetric matrices with
elements in F are congruent in F if and only if they have the same rank.

For every A = A' of rank r is congruent to $A_0 = \mathrm{diag}\{a_1, \dots, a_r, 0, \dots, 0\}$
for $a_i \neq 0$ and $a_i = b_i^2$ for $b_i \neq 0$ in F. Then if
$P = \mathrm{diag}\{b_1^{-1}, \dots, b_r^{-1}, 1, \dots, 1\}$, P is nonsingular and
$PA_0P' = \mathrm{diag}\{I_r, 0\}$. If also B = B' has rank r, then B is
also congruent in F to $\mathrm{diag}\{I_r, 0\}$ and hence to A. The converse follows
from Theorem 2.2.
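
The normalization step of this proof may be sketched for a diagonal matrix whose nonzero elements happen to be squares. The Python fragment is an added illustration; the rational field itself does not satisfy the hypothesis of Theorem 12, so the square roots are supplied by hand, and the name `normalize` and the sample values are hypothetical.

```python
from fractions import Fraction

def normalize(diagonal, roots):
    # diagonal: a_1, ..., a_r followed by zeros; roots: b_i with b_i ** 2 == a_i.
    n, r = len(diagonal), len(roots)
    if any(b * b != a for a, b in zip(diagonal, roots)):
        raise ValueError("each root must square to the matching diagonal element")
    # P = diag{1/b_1, ..., 1/b_r, 1, ..., 1}; the trailing ones keep P nonsingular.
    p = [Fraction(1) / b for b in roots] + [Fraction(1)] * (n - r)
    # For diagonal matrices, P A_0 P' has diagonal elements p_i * a_i * p_i.
    return p, [pi * a * pi for pi, a in zip(p, diagonal)]

p, result = normalize([4, 9, 0], [2, -3])
print(p, result)
```

Either square root of each element may be used, since only its square enters the product.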

We then have the obvious consequences.

Corollary I. Two symmetric matrices whose elements are complex
numbers are congruent in the field of all complex numbers if and only if they have
the same rank.

Corollary II. Let F be the field of either Theorem 12 or Corollary I.
Then two quadratic forms with coefficients in F are equivalent in F if and only
if they have the same rank. Hence every such form of rank r is equivalent
in F to

(49) $x_1^2 + \dots + x_r^2$.

14. Addition of matrices. There is one other result on symmetric matrices
over an arbitrary field which will be seen to have evident interest when we
state it. Its proof involves the computation of the product of two matrices

(50) $A = \begin{pmatrix} A_1 & A_2 \\ A_3 & A_4 \end{pmatrix}$, $B = \begin{pmatrix} B_1 & B_2 \\ B_3 & B_4 \end{pmatrix}$

which have been partitioned into two-rowed square matrices whose
elements are rectangular matrices. If these matrices were one-rowed square
matrices, that is to say elements of F, we should have the formula

(51) $AB = \begin{pmatrix} A_1B_1 + A_2B_3 & A_1B_2 + A_2B_4 \\ A_3B_1 + A_4B_3 & A_3B_2 + A_4B_4 \end{pmatrix}$.

But it is also true that if the partitioning of any A and B is carried out so
that the products in (51) have meaning and if we define the sum of two
matrices appropriately, then (51) will still hold. Thus (51) will have major
importance as a formula for representing matrix computations.

We now let $A = (a_{ij})$ and $B = (b_{ij})$, where i = 1, ..., m and j = 1,
..., n, so that A and B are m by n matrices. Then we define

(52) $S = A + B = (s_{ij})$, $s_{ij} = a_{ij} + b_{ij}$ (i = 1, ..., m; j = 1, ..., n).

We have thus defined addition for any two matrices of the same shape such
that A + B is the matrix whose elements are the sums of correspondingly
placed elements in A and B.

The elements of our matrices are in a field F, and $a_{ij} + b_{ij} = b_{ij} + a_{ij}$.
If $C = (c_{ij})$ is also an m by n matrix, we have
$(a_{ij} + b_{ij}) + c_{ij} = a_{ij} + (b_{ij} + c_{ij})$. Hence we have the properties

(53) $A + B = B + A$, $(A + B) + C = A + (B + C)$.

We observe also that if 0 is the m by n zero matrix, then A + 0 = A,
$A + (-1 \cdot A) = 0$.

Now let $C = (c_{jk})$ be any n by q matrix. Then

(54) $(A + B)C = AC + BC$.

For this equation is clearly a consequence of the corresponding property
of F, that is, of

(55) $(a_{ij} + b_{ij})c_{jk} = a_{ij}c_{jk} + b_{ij}c_{jk}$.

We have thus seen that addition of matrices has the properties (53)
always assumed for addition of the elements of our matrices and that the
law (54), which we call the distributive law for matrix addition and
multiplication, also holds. Clearly if D is a matrix such that DA is defined, then
similarly we have

(56) $D(A + B) = DA + DB$.

Observe, however, that if n > 1 and A and B are n-rowed square
matrices, then |A + B| and |A| + |B| are usually not equal. For example,
if A and B are equal, then $|A| + |B| = 2|A|$ while $|A + B| = |2A| = 2^n|A|$.
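
The remark on determinants is easily checked on a small case. The Python sketch below is an added illustration with a hypothetical three-rowed matrix; the determinant is computed by expansion along the first row.

```python
def det(A):
    # Laplace expansion along the first row; the empty determinant is 1.
    if not A:
        return 1
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def add(A, B):
    # Sum of two matrices of the same shape, element by element.
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[2, 0, 1], [1, 3, 0], [0, 1, 1]]
n = len(A)

sum_of_dets = det(A) + det(A)   # |A| + |B| with B = A
det_of_sum = det(add(A, A))     # |A + B| = |2A|
print(sum_of_dets, det_of_sum, 2 ** n * det(A))
```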

Let us now apply our definitions to derive (51). We let $A = (a_{ij})$ be an
m by n matrix, $B = (b_{jk})$ be an n by q matrix, and $A_1$ be an s by t matrix,
so that $A_1 = (a_{ij})$ but now with i = 1, ..., s and j = 1, ..., t. Then
(51) has meaning only if $B_1$ has t rows, and we thus assume that $B_1$ is a
matrix of t rows and g columns. Our partitioning is now completely
determined and necessarily $A_2$ has s rows and n - t columns, $A_3$ has m - s
rows and t columns, $A_4$ has m - s rows and n - t columns, $B_2$ has t rows
and q - g columns, $B_3$ has n - t rows and g columns, $B_4$ has n - t rows
and q - g columns. The element in the ith row and kth column of AB may
clearly be expressed as

$\sum_{j=1}^{n} a_{ij}b_{jk} = \sum_{j=1}^{t} a_{ij}b_{jk} + \sum_{j=t+1}^{n} a_{ij}b_{jk}$ (i = 1, ..., m; k = 1, ..., q).

But this equation is equivalent to the matrix equation given by

(57) $A = (D_1 \;\; D_2)$, $B = \begin{pmatrix} E_1 \\ E_2 \end{pmatrix}$, $AB = D_1E_1 + D_2E_2$,

where we define $D_1$, $D_2$, $E_1$, $E_2$ by

(58) $D_1 = \begin{pmatrix} A_1 \\ A_3 \end{pmatrix}$, $D_2 = \begin{pmatrix} A_2 \\ A_4 \end{pmatrix}$, $E_1 = (B_1 \;\; B_2)$, $E_2 = (B_3 \;\; B_4)$.

Moreover, we may obtain (51) from (57) by simply using the ranges 1, 2,
..., s and s + 1, ..., m for i separately, as well as 1, 2, ..., g and
g + 1, ..., q for k separately. In matrix language we have used (58) and
computed

(59) $D_1E_1 = \begin{pmatrix} A_1B_1 & A_1B_2 \\ A_3B_1 & A_3B_2 \end{pmatrix}$, $D_2E_2 = \begin{pmatrix} A_2B_3 & A_2B_4 \\ A_4B_3 & A_4B_4 \end{pmatrix}$

as the result of partition of matrices and then have used (57) and addition
of matrices in (59) to give (51).
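
Formula (51) may be checked numerically by partitioning two matrices and comparing the product computed by blocks with the ordinary product. The Python sketch below is an added illustration; the sample matrices and the values s = 1, t = 2, g = 2 are hypothetical choices.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def split(M, r, c):
    # Partition M into four blocks, the first of which has r rows and c columns.
    return ([row[:c] for row in M[:r]], [row[c:] for row in M[:r]],
            [row[:c] for row in M[r:]], [row[c:] for row in M[r:]])

A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]   # m = 3, n = 4
B = [[1, 0, 2], [0, 1, 1], [3, 1, 0], [2, 2, 1]]     # n = 4, q = 3
s, t, g = 1, 2, 2

A1, A2, A3, A4 = split(A, s, t)
B1, B2, B3, B4 = split(B, t, g)

# The four blocks of (51), placed side by side and then stacked.
top = [l + r for l, r in zip(add(matmul(A1, B1), matmul(A2, B3)),
                             add(matmul(A1, B2), matmul(A2, B4)))]
bottom = [l + r for l, r in zip(add(matmul(A3, B1), matmul(A4, B3)),
                                add(matmul(A3, B2), matmul(A4, B4)))]
blockwise = top + bottom
print(blockwise == matmul(A, B))
```

The column partition of A and the row partition of B must use the same t, which is the condition that the products in (51) have meaning.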

We shall now apply the process above to prove the following theorem on
symmetric matrices mentioned above.

Theorem 13. Let $A_1$ and $B_1$ be r-rowed nonsingular symmetric matrices
with elements in F, and A and B be the corresponding n-rowed symmetric
matrices

(60) $A = \begin{pmatrix} A_1 & 0 \\ 0 & 0 \end{pmatrix}$, $B = \begin{pmatrix} B_1 & 0 \\ 0 & 0 \end{pmatrix}$

of rank r. Then A and B are congruent in F if and only if $A_1$ and $B_1$ are
congruent in F.

For if $A_1$ and $B_1$ are congruent in F there exists a nonsingular matrix $P_1$
such that $P_1A_1P_1' = B_1$. Then $P = \mathrm{diag}\{P_1, I_{n-r}\}$ is nonsingular, and
computation by the use of (51) gives PAP' = B. Conversely, if PAP' = B for
a nonsingular matrix P we may write

$P = \begin{pmatrix} P_1 & P_2 \\ P_3 & P_4 \end{pmatrix}$,

where $P_1$ is an r-rowed square matrix, and we shall have

$P' = \begin{pmatrix} P_1' & P_3' \\ P_2' & P_4' \end{pmatrix}$, $PAP' = P \begin{pmatrix} A_1P_1' & A_1P_3' \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} P_1A_1P_1' & P_1A_1P_3' \\ P_3A_1P_1' & P_3A_1P_3' \end{pmatrix}$.

But then $B_1 = P_1A_1P_1'$, and $P_1$ must be nonsingular since
$|B_1| = |P_1'| \cdot |A_1| \cdot |P_1| \neq 0$. Hence $B_1$ and $A_1$ are congruent in F.

The result above thus states that, if f and g are quadratic forms with
coefficients in F so that f and g are equivalent only if they have the same
rank r, then f may be written as a nonsingular form $f_0$ in r variables, g may be
written as a nonsingular form $g_0$ in r variables, and finally the original forms
f and g are equivalent in F if and only if $f_0$ and $g_0$ are equivalent in F.


ORAL EXERCISES

1. Compute A + B if

a) A =


-1 3\ 

0 2, B = 
3 4/ 


1 7 -6\ 
-2 1 - 5 ) 
4 3 -1/ 


b) $A = \begin{pmatrix} -3 & 4 & 5 \\ 5 & -1 & 2 \\ 3 & 1 & 1 \end{pmatrix}$, $B = \begin{pmatrix} -3 & 5 & 3 \\ 4 & -1 & 1 \\ 5 & 2 & 1 \end{pmatrix}$

2. Verify that (A + B)' = A' + B' for any m by n matrices A and B.

3. Show that every n-rowed square matrix A is expressible uniquely as the sum
of a symmetric matrix B and a skew matrix C. Hint: Put A = B + C with B = B',
C' = -C, compute A', and solve.


15. Real quadratic forms. We shall close our study of symmetric
matrices and hence of quadratic forms with coefficients in F as well by
considering the case where F is the field of all real numbers. Let then f = x'Ax
have rank r so that we may take $f = a_1x_1^2 + \dots + a_rx_r^2$ for real $a_i \neq 0$.

We now call the number of positive $a_i$ the index t of f and prove

Theorem 14. Two quadratic forms with real coefficients are equivalent in
the field of all real numbers if and only if they have both the same rank and the
same index.

For proof we observe, first, that by Section 14 there is no loss of generality
if we assume that f = x'Ax where A is a nonsingular diagonal matrix, that
is, r = n. Moreover, there is clearly no loss of generality if we take
$A = \mathrm{diag}\{d_1, \dots, d_t, -d_{t+1}, \dots, -d_r\}$ for positive $d_i$. Then there exist real
numbers $p_i \neq 0$ such that $p_i^2 = d_i$, and if P is the nonsingular matrix
$\mathrm{diag}\{p_1^{-1}, \dots, p_r^{-1}\}$ then PAP' is the matrix $\mathrm{diag}\{1, \dots, 1, -1, \dots, -1\}$
with t elements 1 and r - t elements -1. Thus, f is equivalent in F to

(61) $(x_1^2 + \dots + x_t^2) - (x_{t+1}^2 + \dots + x_r^2)$.

Now, if g has the same rank and index as f, it is equivalent in F to (61) and
hence to f. Conversely, let g have rank r = n and index s so that g is
equivalent in F to

(62) $(x_1^2 + \dots + x_s^2) - (x_{s+1}^2 + \dots + x_r^2)$.

We propose to show that s = t. There is clearly no loss of generality if we
assume that $s \geq t$ and show that if s > t we arrive at a contradiction.

Hence, let s > L Our hypothesis that the form / defined by (61) and 
the form g defined by (62) are equivalent in the real field implies that there 
exist real numbers d*, such that if we substitute the linear forms 

(63) Xi = dnyi H- . . . + dinVn (i = 1, . . . , n) , 

in / = /(xi, . . . yxjy we obtain as a result (t/i + . . . + p?) - (yhi + 

. . . + 2/2). Put xi = X2 = . . . = Xt = 0 and p.+i = . . . = Pn = 0 in (63) 

and consider the resulting t equations 

(64) diiyi + . . . + di,!/, = 0 (i = 1, . . . , 0 . 

These are I linear homogeneous equations in s > t unknowns, and there 
exist real numbers Vi, . . . , v, not all zero and satisfying these equations. 
The remaining n — t equations of (63) then determine the values of xy as 
certain numbers u,- for j = t + ly n; and we have the result h = 

/(O, 0, . . . , 0, Ut+iy , . . y uj = vi + , . . + vi > 0, But clearly h = 

— (w?+i + . . . + wj) ^ 0, a contradiction. 

We have now shown that two quadratic forms with real coefficients and 
the same rank are equivalent in the field of all complex numbers but that, 
if their indices are distinct, they are inequivalent in the field of all real 
numbers. We shall next study in some detail the important special case 
< = r of our discussion. 
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A symmetric matrix A and the corresponding quadratic form / = x^Ax 
are called aemidefinite of rank r if A is congruent in F to a matrix 

(air 0\ 

VO o;’ 

for a 7*^ 0 in F. Thus / is semidefinite if it is equivalent to a form a{x\ + 

. . . + X?). We call A and / definite if r = n, that is, A is both semidefinite 
and nonsingular. 

If F is the field of all real numbers, we may take a = 1 and call A and f
positive or take a = -1 and call A and f negative. Then A and f are negative
if and only if -A and -f are positive. Thus we may and shall restrict
our attention to positive symmetric matrices and positive quadratic
forms without loss of generality.

If $f(x_1, \dots, x_n)$ is any real quadratic form, we have seen that there
exists a nonsingular transformation (63) with real $d_{ij}$ such that
$f = y_1^2 + \dots + y_t^2 - (y_{t+1}^2 + \dots + y_r^2)$. If $c_1, \dots, c_n$ are any real numbers and
if we put $x_i = c_i$ in (63), there exist unique solutions $y_j = d_j$ of the resulting
system of linear equations, and the $d_j$ may readily be seen to be all zero
if and only if the $c_i$ are all zero. Now if t < r, we have f < 0 for
$y_1 = \dots = y_t = 0$, $y_{t+1} = 1$, $f(c_1, \dots, c_n) < 0$. Conversely, if $f(c_1, \dots, c_n) < 0$, then
t < r. For otherwise $f = y_1^2 + \dots + y_r^2$, $f(c_1, \dots, c_n) = d_1^2 + \dots + d_r^2 \geq 0$.
If t = r < n, then we put $y_{r+1} = 1$ and all other $y_j = 0$ and have
$c_1, \dots, c_n$ not all zero such that $f(c_1, \dots, c_n) = 0$. Hence, if
$f(c_1, \dots, c_n) > 0$ for all real $c_i$ not all zero, the form f is positive definite. Conversely,
if f is positive definite, we have $f = y_1^2 + \dots + y_n^2$,
$f(c_1, \dots, c_n) = d_1^2 + \dots + d_n^2 > 0$ for all $d_j$ not all zero and hence for all $c_i$ not all zero. We
have proved

Theorem 15. A real quadratic form $f(x_1, \dots, x_n)$ is positive semidefinite
if and only if $f(c_1, \dots, c_n) \geq 0$ for all real $c_i$, is positive definite if and only
if $f(c_1, \dots, c_n) > 0$ for all real $c_i$ not all zero.
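
A small worked illustration of rank, index, and the sign criterion just proved may help. The form below is an illustrative choice of ours and is not taken from the text; completing the square plays the role of the nonsingular real transformation (63).

```latex
% Illustrative form (an assumption for the example):
f = x_1^2 + 4x_1x_2 + x_2^2 = (x_1 + 2x_2)^2 - 3x_2^2 .
% Put y_1 = x_1 + 2x_2, \; y_2 = \sqrt{3}\,x_2, a nonsingular real transformation:
f = y_1^2 - y_2^2 , \qquad r = 2 , \quad t = 1 .
% Since t < r the form takes a negative value, for instance
f(2, -1) = 4 - 8 + 1 = -3 < 0 ,
% so f is neither positive definite nor positive semidefinite.
```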

As a consequence of this result we shall prove 

Theorem 16. Every principal submatrix of a positive semidefinite matrix
is positive semidefinite; every principal submatrix of a positive definite matrix
is positive definite.

For a principal submatrix B of a symmetric matrix A is defined as any
m-rowed symmetric submatrix whose rows are in the $i_1$th, ..., $i_m$th rows
of A and whose corresponding columns are in the corresponding columns
of A. Put $x_0 = (x_{i_1}, \dots, x_{i_m})$ so that $g = x_0'Bx_0$ is the quadratic form with
B as matrix, and we obtain g from f by putting $x_j = 0$ in f for $j \neq i_k$.
Clearly, if $f \geq 0$ for all $x_i = c_i$, then $g \geq 0$ for all values of the $x_{i_k}$, and
hence B is positive semidefinite by Theorem 15. If A is positive definite
and B is singular, then g = 0 for $x_{i_k}$ not all zero, and hence f = 0 for the
$x_j$ above all zero and for the $x_{i_k}$ not all zero, a contradiction.

The converse of Theorem 16 is also true, and we refer the reader to the
author's Modern Higher Algebra for its proof. We shall use the result just
obtained to prove

Theorem 17. Let A be an m by n matrix of rank r and with real elements.
Then AA' is a positive semidefinite real symmetric matrix of rank r.

For we may write

$A = P\begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}Q$ ,  $QQ' = \begin{pmatrix} Q_1 & Q_2 \\ Q_3 & Q_4 \end{pmatrix}$ ,

where P and Q are nonsingular matrices of m and n rows, respectively.
Then QQ' is a positive definite symmetric matrix, and we partition QQ' so
that $Q_1$ is an r-rowed principal submatrix of QQ'. By Theorem 16 the matrix
$Q_1$ is positive definite. Then

$AA' = P\begin{pmatrix} Q_1 & 0 \\ 0 & 0 \end{pmatrix}P' = PCP'$

is congruent to the positive semidefinite matrix C of rank r and hence has
the property of our theorem.
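
The semidefiniteness asserted by Theorem 17 can also be seen directly, since cAA'c' is the sum of the squares of the coordinates of cA. The following is a minimal sketch in Python, assuming an illustrative 3 by 2 rational matrix of our own choosing; the helper names are hypothetical and exact arithmetic is used throughout.

```python
from fractions import Fraction

def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Ordinary matrix product XY."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def quadratic_form(S, c):
    """Value of c S c' for a symmetric matrix S and a row vector c."""
    n = len(c)
    return sum(c[i] * S[i][j] * c[j] for i in range(n) for j in range(n))

# Illustrative 3 by 2 matrix of rank 2 (an assumption, not from the text).
A = [[Fraction(x) for x in row] for row in [[1, 2], [2, 4], [0, 1]]]
S = matmul(A, transpose(A))  # S = AA', a three-rowed symmetric matrix

def sum_of_squares(c):
    """Sum of the squares of the coordinates of the vector cA."""
    cA = matmul([c], A)[0]
    return sum(x * x for x in cA)

# A nonzero vector with cA = 0, so the form AA' is semidefinite but not definite.
null_vector = [2, -1, 0]
```

Here quadratic_form(S, c) agrees with sum_of_squares(c) for every c, so the values are never negative, while null_vector gives the value zero because AA' is singular.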


EXERCISE 

What are the ranks and indices of the real symmetric matrices of Ex. 2,
Section 11?



CHAPTER IV 


LINEAR SPACES 

1. Linear spaces over a field. The set $V_n$ of all sequences

(1) $u = (c_1, \dots, c_n)$

may be thought of as a geometric n-dimensional space. We assume the
laws of combination of such sequences of Section 1.8 and call u a point, or
vector, of $V_n$. We suppose also that the quantities $c_i$ are in a fixed field F
and call $c_i$ the ith coordinate of u, the quantities $c_1, \dots, c_n$ the coordinates
of u. The entire set $V_n$ will then be called the n-dimensional linear space
over F.

The properties of a field F may be easily seen to imply that

(2) (u + v) + w = u + (v + w) ,  u + v = v + u ,

(3) a(bu) = (ab)u ,  (a + b)u = au + bu ,  a(u + v) = au + av

for all a, b in F and all vectors u, v, w of $V_n$. The vector which we designate
by 0 is that vector all of whose coordinates are zero, and we have

(4) u + 0 = u ,  u + (-u) = 0 ,

where $-u = (-c_1, \dots, -c_n)$. Then

(5) $0 \cdot u = 0$ ,  $1 \cdot u = u$ ,  $-u = (-1) \cdot u$ .

Note that the first 0 of (5) is the quantity 0 of F, and the second zero is the
zero vector. We shall use the properties just noted somewhat later in an
abstract definition of the mathematical concept of linear space, and we leave
the verification of the properties (2), (3), (4), and (5) of $V_n$ to the reader.
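
The verification left to the reader can be spot-checked mechanically. The sketch below, in Python with exact rational arithmetic, takes F to be the rational field and n = 3; the particular vectors and scalars are illustrative assumptions, and a check on sample values is of course no substitute for the general verification.

```python
from fractions import Fraction

def add(u, v):
    """Coordinatewise sum of two vectors of V_n."""
    return tuple(a + b for a, b in zip(u, v))

def scale(a, u):
    """Product of the scalar a of F by the vector u."""
    return tuple(a * c for c in u)

# Illustrative vectors of V_3 over the rational field, and two scalars.
u = (Fraction(1), Fraction(-2), Fraction(1, 2))
v = (Fraction(3), Fraction(0), Fraction(5))
w = (Fraction(-1), Fraction(4), Fraction(2, 3))
a, b = Fraction(2, 3), Fraction(-5)
zero = (Fraction(0),) * 3

checks = {
    "(2) associative": add(add(u, v), w) == add(u, add(v, w)),
    "(2) commutative": add(u, v) == add(v, u),
    "(3) a(bu) = (ab)u": scale(a, scale(b, u)) == scale(a * b, u),
    "(3) (a + b)u = au + bu": scale(a + b, u) == add(scale(a, u), scale(b, u)),
    "(3) a(u + v) = au + av": scale(a, add(u, v)) == add(scale(a, u), scale(a, v)),
    "(4) u + 0 = u": add(u, zero) == u,
    "(4) u + (-u) = 0": add(u, scale(-1, u)) == zero,
    "(5) 0u = 0 and 1u = u": scale(0, u) == zero and scale(1, u) == u,
}
```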

2. Linear subspaces. A subset L of $V_n$ is called a linear subspace of $V_n$ if
au + bv is in L for every a and b of F, every u and v of L. Then it is clear
that L contains all linear combinations

(6) $u = a_1u_1 + \dots + a_mu_m$ ,

for $a_i$ in F, and $u_i$ in L.

We observe that the set of all linear combinations (6) of any finite number
m of given vectors $u_1, \dots, u_m$ is a linear subspace L of $V_n$ according
to this definition. If now L is so defined, we shall say that $u_1, \dots, u_m$
span the space L and shall write

(7) $L = \{u_1, \dots, u_m\}$ .

It is clear that, if $e_j$ is the vector whose jth coordinate is unity and whose
other coordinates are zero, then $e_1, \dots, e_n$ span $V_n$.

The space spanned by the zero vector consists of the zero vector alone
and may be called the zero space and designated by L = {0}. In what follows
we shall restrict our attention to the nonzero subspaces L of $V_n$, and,
for the time being, we shall indicate, when we call L a linear space over F,
that L is a linear subspace over F of some $V_n$ over F.

3. Linear independence. Our definition of a linear space L which is a
subspace of $V_n$ implies that every subspace L contains the zero vector. If
$L = \{u_1, \dots, u_m\}$, then $0 = 0u_1 + \dots + 0u_m$. Hence the zero vector of
L is expressible in the form (6) in at least this one way. We shall say that
$u_1, \dots, u_m$ are linearly independent in F or that $u_1, \dots, u_m$ are a set of
linearly independent vectors (of $V_n$ over F), if there is no other such expression
of 0 in the form (6). Thus, $u_1, \dots, u_m$ are linearly independent if it
is true that a linear combination $a_1u_1 + \dots + a_mu_m = 0$ if and only if
$a_1 = a_2 = \dots = a_m = 0$. If $u_1, \dots, u_m$ are not linearly independent in F,
we shall say that they are linearly dependent in F.

A set of vectors $u_1, \dots, u_m$ are now seen to be linearly independent in F
if and only if the expression of every u of $L = \{u_1, \dots, u_m\}$ in the form (6)
is unique. For this property clearly implies linear independence as the special
case u = 0. Conversely, if $u_1, \dots, u_m$ are linearly independent and
$u = a_1u_1 + \dots + a_mu_m = b_1u_1 + \dots + b_mu_m$, then
$0 = (a_1 - b_1)u_1 + \dots + (a_m - b_m)u_m$, $a_i - b_i = 0$, $a_i = b_i$ as desired. We now make the

Definition. Let $L = \{u_1, \dots, u_m\}$ over F and $u_1, \dots, u_m$ be linearly independent
in F. Then we shall call $u_1, \dots, u_m$ a basis over F of L and indicate
this by writing

(8) $L = u_1F + \dots + u_mF$ .

It is evident that $V_n = e_1F + \dots + e_nF$. But we may actually show
that every subspace L spanned by a finite number of vectors of $V_n$ has a
basis in the above sense. Observe, first, that the definition of linear independence
in case m = 1 is that au = 0 only if a = 0 and thus that $u \neq 0$.
Then let $u_1, \dots, u_r$ be linearly independent vectors of $V_n$ and $u \neq 0$ be
another vector. Then either $u_1, \dots, u_r, u$ are linearly independent in F or
$a_1u_1 + \dots + a_ru_r + a_0u = 0$ for $a_i$ not all zero. If $a_0 = 0$, then
$a_1u_1 + \dots + a_ru_r = 0$, from which $a_1 = \dots = a_r = 0$, a contradiction. But then
$u = (-a_0^{-1}a_1)u_1 + \dots + (-a_0^{-1}a_r)u_r$ is in $\{u_1, \dots, u_r\}$. It follows that,
if $u_1, \dots, u_m$ are any m distinct nonzero vectors, we may choose some
largest number r of vectors in this set which are linearly independent in F
and we will then have the property that all remaining vectors in the set are
linear combinations with coefficients in F of these r. We state this result as

Theorem 1. Let L be spanned by m distinct nonzero vectors $u_1, \dots, u_m$.
Then L has a basis consisting of certain r of these vectors, $1 \leq r \leq m$.
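
The argument for Theorem 1 is constructive: run through the spanning vectors and keep each one that is not a linear combination of those already kept. A minimal sketch of this selection follows, in Python over the rational field. The function names are hypothetical, and independence is tested by row reduction, which is our choice of method and not the one used in the text at this point.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) over the rationals, by row reduction."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def select_basis(vectors):
    """Keep each vector that is independent of the vectors kept before it."""
    basis = []
    for u in vectors:
        if rank(basis + [u]) > len(basis):
            basis.append(u)
    return basis

# Three sample vectors of V_4; the second is twice the first.
u1, u2, u3 = (1, 1, 1, -1), (2, 2, 2, -2), (1, 2, 3, 4)
basis = select_basis([u1, u2, u3])  # keeps u1 and u3
```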


EXERCISES 

1. Determine which of the following sets of three vectors form a basis of the
subspace they span. Hint: It is easy to see whether some two of the vectors, say
$u_1$ and $u_2$, are linearly independent. To see if $u_1, u_2, u_3$ are linearly dependent we
write $u_3 = xu_1 + yu_2$ and solve for x and y.

a) $u_1 = (1, 3, -3)$ ,  $u_2 = (2, 5, -2)$ ,  $u_3 = (1, 1, 6)$

b) $u_1 = (3, 1, 2)$ ,  $u_2 = (4, 1, 3)$ ,  $u_3 = (-1, -1, 0)$

c) $u_1 = (-1, -1, 1)$ ,  $u_2 = (3, 2, 1)$ ,  $u_3 = (7, 3, 9)$

d) $u_1 = (1, -2, -1, 3)$ ,  $u_2 = (2, -1, 1, 6)$ ,  $u_3 = (0, -1, -3, 1)$

e) $u_1 = (1, 1, 1, -1)$ ,  $u_2 = (2, 2, 2, -2)$ ,  $u_3 = (1, 2, 3, 4)$

f) $u_1 = (1, -1, 1, -1)$ ,  $u_2 = (1, 1, 1, 0)$ ,  $u_3 = (5, -1, 5, -3)$


2. Show that the space $L = \{u_1, u_2, u_3\}$ spanned by the following sets of vectors
is $V_3$. Hint: In every case one of the vectors, say $u_1$, has first coordinate not zero,
and L contains $u_2 - x_2u_1 = (0, b_2, c_2)$, $u_3 - x_3u_1 = (0, b_3, c_3)$. Some linear combination
of these two quantities has the form (0, 0, d) and L contains $e_3 = (0, 0, 1)$. It
is easy to show then that L contains $e_2 = (0, 1, 0)$ and $e_1 = (1, 0, 0)$, $L = V_3$.

a) $u_1 = (1, -2, 3)$ ,  $u_2 = (2, 3, 1)$ ,  $u_3 = (-1, 3, 2)$

b) $u_1 = (0, 1, 8)$ ,  $u_2 = (1, -3, 6)$ ,  $u_3 = (1, -1, 23)$

c) $u_1 = (0, 3, 2)$ ,  $u_2 = (0, 2, 1)$ ,  $u_3 = (1, 5, 4)$


3. Determine whether or not the spaces spanned by the following sets of vectors
$u_1, u_2$ coincide with those spanned by the corresponding $v_1, v_2$. Hint: If
$L_1 = \{u_1, u_2\} = L_2 = \{v_1, v_2\}$, there must exist solutions $x_1, x_2, x_3, x_4$ of the equations
$v_1 = x_1u_1 + x_2u_2$, $v_2 = x_3u_1 + x_4u_2$ such that the determinant $x_1x_4 - x_2x_3 \neq 0$.

a) $u_1 = (3, -1, 2, 1)$ ,  $u_2 = (1, 4, 6, 1)$ ,
   $v_1 = (7, -11, -6, 1)$ ,  $v_2 = (7, 2, 10, 3)$

b) $u_1 = (1, 2, -1, 2)$ ,  $u_2 = (1, 2, 3, 4)$ ,
   $v_1 = (1, 2, -13, -4)$ ,  $v_2 = (0, 0, 4, 1)$

c) $u_1 = (1, -1, 2, -3)$ ,  $u_2 = (2, -2, 4, -6)$ ,
   $v_1 = (3, -3, 6, -9)$ ,  $v_2 = (4, -4, 8, -9)$

d) $u_1 = (1, -1, 0, 3)$ ,  $u_2 = (2, 1, 0, 1)$ ,
   $v_1 = (-1, -2, 1, 2)$ ,  $v_2 = (2, -2, 0, 6)$

e) $u_1 = (1, -2, 0, 1)$ ,  $u_2 = (2, -1, 1, 0)$ ,
   $v_1 = (3, -3, 1, 1)$ ,  $v_2 = (-1, -1, -1, 1)$

f) $u_1 = (1, 0, 1, -2)$ ,  $u_2 = (0, 1, -1, 2)$ ,
   $v_1 = (2, -2, 4, -8)$ ,  $v_2 = (1, 1, 0, 0)$

4. The row and column spaces of a matrix. We shall obtain the principal
theorems on the elementary properties of linear spaces by connecting this
theory with certain properties of matrices which we have already derived.
Let us consider a set of m vectors,

(9) $u_i = (a_{i1}, \dots, a_{in})$  (i = 1, ..., m) ,

of $V_n$ over F. Then we may regard $u_i$ as being the ith row of the corresponding
m by n matrix $A = (a_{ij})$ and the space $L = \{u_1, \dots, u_m\}$ as being
what we shall call the row space of A. Thus every m by n matrix defines a
linear subspace of $V_n$, every subspace of $V_n$ spanned by m vectors defines
a corresponding m by n matrix.

If $P = (p_{ki})$ is any q by m matrix, the product PA is a q by n matrix.
The jth coordinate of the vector

(10) $w_k = p_{k1}u_1 + \dots + p_{km}u_m$

is $p_{k1}a_{1j} + \dots + p_{km}a_{mj}$, that is, the element in the kth row and jth
column of PA. Hence the kth row of PA is that linear combination of the rows
of A whose coefficients form the kth row of P.
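
The statement about the rows of PA is easy to check on an example. The sketch below, in Python, compares each row of PA with the corresponding linear combination (10) of the rows of A; the two matrices are illustrative assumptions and the helper names are hypothetical.

```python
def matmul(P, A):
    """Ordinary matrix product PA, with matrices given as lists of rows."""
    return [[sum(p * a for p, a in zip(prow, col)) for col in zip(*A)] for prow in P]

def combination(coeffs, rows):
    """The vector coeffs[0]*rows[0] + ... + coeffs[m-1]*rows[m-1]."""
    n = len(rows[0])
    return [sum(c * row[j] for c, row in zip(coeffs, rows)) for j in range(n)]

# Illustrative 2 by 4 matrix A (rows u_1, u_2) and 3 by 2 matrix P.
A = [[1, -2, 1, 0],
     [2, -3, -1, 1]]
P = [[1, 1],
     [0, 2],
     [3, -1]]

PA = matmul(P, A)
# w_k of (10): the combination of the rows of A with coefficients from row k of P.
w = [combination(P[k], A) for k in range(len(P))]
```

Every row of PA is thus visibly a vector of the row space of A, which is the observation the next paragraph builds on.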

We have now shown that every row of PA is in the row space of A. It
follows that the row space of PA is contained in that of A. If P is nonsingular,
then the result just derived implies that the row space of $A = P^{-1}(PA)$
is contained in the row space of PA and therefore that these two linear
spaces are the same. Thus we have

Lemma 1. Let P be an m-rowed nonsingular matrix. Then the row spaces
of PA and A coincide.

In the proof of Theorem 3.6 we used the matrix product equivalent of
the elementary transformation Theorem 2.4, and we shall now state this
theorem as the useful

Lemma 2. Let A be an m by n matrix of rank r. Then there exist nonsingular
matrices P and Q of m and n rows, respectively, such that

(11) $PA = \begin{pmatrix} G \\ 0 \end{pmatrix}$ ,  $AQ = (H \;\; 0)$ ,

where G and H have rank r, G is an r by n matrix, H is an m by r matrix.

We use (11) and note that the rows of G differ from those of PA only in
zero rows. Then the rows of G span the row space of PA. By Lemma 1
we have

Lemma 3. The row spaces of G and A coincide.

We shall use this result in the proof of

Theorem 2. The r rows of G form a basis of the row space of A. Any
r + 1 vectors of the row space of A are linearly dependent in F.

For we may designate the rows of G by $v_1, \dots, v_r$. Our definition of G
implies that there is no loss of generality if we permute its rows in any desired
fashion. Thus, if $b_1v_1 + \dots + b_rv_r = 0$ for $b_i$ in F not all zero, we
may assume for convenience that $b_1 \neq 0$. The determinant of the r-rowed
square matrix

(12) $P = \begin{pmatrix} b_1 \;\; b_2 \;\; \dots \;\; b_r \\ P_2 \end{pmatrix}$ ,  $P_2 = (0, \; I_{r-1})$

is clearly $b_1$, and hence P is nonsingular. Then PG is r-rowed and of rank r.
But this is impossible since $b_1v_1 + \dots + b_rv_r = 0$ is the first row of PG.
Hence the rows of G are linearly independent in F, they span the row space
of G and of A, they are a basis of the row space of A.

Assume, now, that $w_k = b_{k1}v_1 + \dots + b_{kr}v_r$ for k = 1, ..., r + 1, so
that $w_k$ are any r + 1 vectors of the row space $L = v_1F + \dots + v_rF$ of A.
Define $B = (b_{kj})$ and obtain $w_k$ as the kth row of the r + 1 by n matrix
BG. By Theorem 3.6 the rank of BG is at most r, and by Lemma 2 there
exists a nonsingular (r + 1)-rowed matrix $D = (d_{gk})$ for g, k = 1, ...,
r + 1, such that the (r + 1)st row of D(BG) is the zero vector. But then
$d_{r+1,1}w_1 + \dots + d_{r+1,r+1}w_{r+1} = 0$, the $d_{r+1,k}$ form a row of the nonsingular
matrix D and cannot all be zero, the vectors $w_1, \dots, w_{r+1}$ are linearly
dependent in F. This proves Theorem 2.

We now apply Theorem 1 to obtain

Theorem 3. The matrix G of Lemma 2 may be taken to be a submatrix of A.
The integer r of Theorem 1 is in fact the rank of the m by n matrix whose ith
row is $u_i$.

For let A be an m by n matrix whose ith row is $u_i$ and let r be the integer
of Theorem 1, the rank of A be s. After a permutation of the rows of A, if
necessary, we may assume that $u_1, \dots, u_r$ are a basis of the row space of A.
Then $u_k = b_{k1}u_1 + \dots + b_{kr}u_r$ for $b_{kj}$ in F, and k = r + 1, ..., m. The
matrix P given by

(13) $P = \begin{pmatrix} I_r & 0 \\ -B & I_{m-r} \end{pmatrix}$ ,  $B = (b_{kj})$ ,

is nonsingular, and it is clear that if

(14) $G = \begin{pmatrix} u_1 \\ \vdots \\ u_r \end{pmatrix}$ ,

then PA is given by (11). It follows that s is the rank of G, $s \leq r$. If s < r
we apply Lemma 2 to obtain a nonsingular r-rowed matrix $D = (d_{gk})$ such
that the rth row of DG = 0, the rth row of D is not zero,
$d_{r1}u_1 + \dots + d_{rr}u_r = 0$ contrary to our hypothesis that $u_1, \dots, u_r$ are linearly independent.
This completes our proof.

We may now obtain the principal result on linear subspaces of $V_n$.

Theorem 4. Every linear subspace L over F of $V_n$ over F has a basis. Any
two bases of L have the same number $r \leq n$ of vectors, and we shall call r the
order of L over F.

For $V_n = e_1F + \dots + e_nF$, and by Theorem 2 any n + 1 vectors of
$V_n$ are linearly dependent in F. Thus any linear subspace L over F of $V_n$
contains at most n linearly independent vectors. It follows that there
exists a maximum number $r \leq n$ of linearly independent vectors $u_1, \dots, u_r$
in L, and that $u, u_1, \dots, u_r$ are linearly dependent in F for every u of L.
By the proof of Theorem 1 the vector u is in $\{u_1, \dots, u_r\}$,
$L = u_1F + \dots + u_rF$. But if also $L = v_1F + \dots + v_sF$, then Theorem 2 implies that
$s \leq r$ and similarly that $r \leq s$, r = s is unique.

In closing this section we note that the rows of the transpose A' of A
are uniquely determined by the columns of A and are, indeed, their transposes.
Thus we shall call the row space of A' the column space of A. It is
a linear subspace of $V_m$. We call the order of the row and column spaces
of A, respectively, the row and column ranks of A. By Theorem 3 the rank
of A is its row rank. Also A and A' have the same rank and we have proved

Theorem 5. The row and column ranks of a matrix are equal to its rank.
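
The equality of row and column ranks can be observed numerically by reducing a matrix and its transpose. The following is a minimal sketch in Python with exact rational arithmetic; the function names are hypothetical, row reduction stands in for the elementary transformations of the text, and the sample matrix is a five-rowed matrix whose last three rows are combinations of the first two.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) over the rationals, by row reduction."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

D = [[1, 2, -1, 3, 4],
     [3, 5, 1, 0, 2],
     [1, 1, 3, -6, -6],
     [4, 7, 0, 3, 6],
     [1, 0, 7, -15, -16]]

row_rank = rank(D)                # order of the row space of D
column_rank = rank(transpose(D))  # order of the column space of D
```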


EXERCISES 

1. Solve Ex. 1 and 2 of Section 3 by the use of elementary transformations to
compute the rank of the matrices whose rows are the given vectors.

2. Form the four-rowed matrices whose rows are the vectors $u_1, u_2, v_1, v_2$ of Ex. 3,
Section 3. Show thus that $L_1 = \{u_1, u_2\} = L_2 = \{v_1, v_2\}$ if and only if the ranks
of the corresponding matrices A are equal to the order of the subspace $L_1$, that is,
the rank of the matrices formed from the first two rows of each A.

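3. Find a basis of the row space of each of the following matrices, the basis to
consist actually of rows of the corresponding matrix.

a) $\begin{pmatrix} 1 & -2 & 1 & 0 \\ 2 & -4 & 2 & 0 \\ 2 & -3 & -1 & 1 \\ 4 & -7 & 1 & 1 \\ 2 & -3 & -1 & 1 \end{pmatrix}$

b) $\begin{pmatrix} 2 & 3 & 0 & 1 \\ -2 & -1 & -1 & 1 \\ -2 & -1 & 0 & 3 \\ 0 & 4 & -1 & 2 \\ 2 & 1 & 2 & -2 \end{pmatrix}$

c) $\begin{pmatrix} -1 & 0 & 0 & 1 \\ 0 & -1 & 1 & 1 \\ 2 & 3 & 0 & -1 \\ 5 & 1 & 0 & 0 \\ 1 & 2 & 3 & 4 \end{pmatrix}$

d) $\begin{pmatrix} 1 & 2 & -1 & 3 & 4 \\ 3 & 5 & 1 & 0 & 2 \\ 1 & 1 & 3 & -6 & -6 \\ 4 & 7 & 0 & 3 & 6 \\ 1 & 0 & 7 & -15 & -16 \end{pmatrix}$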


4. Let A be a rectangular matrix of the form 



where Ai is nonsingular. Show that then there exists a nonsingular matrix P such 
that 



Give also a simple form for P. Hint: Show that the rows of As are in the row space 
of Ai. 





5. Let the matrix of Ex. 4 be either symmetric or skew. Show that the choice
of P then implies that



Show also that, if the order of Ai is the rank of A, the matrix A^ = 0. 

5. The concept of equivalence. In discussing the properties of mathematical
systems such as fields and linear spaces over F it becomes desirable
quite frequently to identify in some fashion those systems behaving exactly
the same with respect to the given set of definitive properties under consideration.
We shall call such systems equivalent and shall now proceed to
define this concept in terms of that of function.

Let G and G' be two systems and define a single-valued function f on G
to G'. In elementary mathematics it is customary to call G the range of
the independent variable and G' the range of the dependent variable. However,
it is more convenient in general situations to say that f is a function
on G to G' or that f is a correspondence on G to G'. Then f is given by

(15) $g \to g' = f(g)$

(read g corresponds to g') such that every element g of G determines a unique
corresponding element g' of G'. In elementary algebra and analysis the systems
G and G' are usually taken to be the field of all real or all complex
numbers and (15) is then given by a formula y = f(x). But the basic idea
there is that given above of a correspondence (15) on G to G'. This concept
may be seen to be sufficiently general as to permit its extension in many
directions.

Suppose now that (15) is a correspondence such that every element g' of
G' is the corresponding element f(g) of one and only one g of G. Then we
call (15) a one-to-one correspondence on G to G'. It is clear that (15) then
defines a second one-to-one correspondence

(16) $f(g) = g' \to g$ ,

which is now on G' to G, and we may thus call (15) a one-to-one correspondence
between G and G' and indicate this by writing

(17) $g \leftrightarrow g'$ .

Note, however, that, if G and G' are the same system, the functions (15)
and (16) are, in general, distinct. Thus we may let G be the field of all real
numbers and (15) be the function $x \to 2x$, so that (16) is the function
$x \to \tfrac{1}{2}x$. Of course, if G and G' are distinct it is not particularly important
whether we use (15) or (16) to define our correspondence.

We proceed to use the concept just given in constructing the fundamental
definition of this section, that of equivalence. Let us consider two mathematical
systems G, G' of the same kind such as two fields or two linear
spaces over a fixed field F. These systems consist of sets of elements g, h,
. . . closed with respect to certain operations. (Thus, for example, we might
have g + h in G for every g and h in G and also g' + h' in G' for every g'
and h' in G'.) We then call G and G' equivalent if there exists a one-to-one
correspondence between them which is preserved under the operations of
their definition. We now see that we have defined two fields F and F' to
be equivalent if there exists a one-to-one correspondence (15) between them
such that (g + h)' = g' + h', (gh)' = g'h' for every g and h of F. Let us
then pass to the second case which we require for our further discussion of
linear spaces.

Let F be a fixed field and V consist of a set of elements such that u + v
and au are unique elements of V for every u and v of V and a of F. Then
we shall call V a general linear space over F. If $V_0$ is a second* such space
and there is a one-to-one correspondence $u \leftrightarrow u_0$ between V and $V_0$ such
that

$(u + v)_0 = u_0 + v_0$ ,  $(au)_0 = au_0$

for every u and v of V and a of F, then we shall say that V and $V_0$ are equivalent
over F. We have thus introduced two instances of what is a very important
concept in all algebra.

The reader should observe that under our definition every mathematical 
system G is equivalent to itself and that if G is equivalent to a system G', 
then G' is equivalent to G. Finally, if G' is equivalent to G", then G is 
equivalent to G". 
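
As an illustration of an equivalence of fields in the sense just defined, consider complex numbers and two-rowed real matrices. The sketch below is in Python and rests on two assumptions that are ours and not the text's: a + bi is made to correspond to the matrix with rows (a, -b) and (b, a), which is one standard choice, and the coordinates are restricted to rational numbers so that the arithmetic is exact. It checks on sample values that sums and products are preserved.

```python
from fractions import Fraction

def to_matrix(z):
    """Matrix assigned to z = (a, b), standing for a + bi (assumed form)."""
    a, b = z
    return ((a, -b), (b, a))

def cadd(z, w):
    """Sum of two complex numbers given as pairs."""
    return (z[0] + w[0], z[1] + w[1])

def cmul(z, w):
    """Product of two complex numbers given as pairs, using i*i = -1."""
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])

def madd(X, Y):
    """Sum of two two-rowed square matrices."""
    return tuple(tuple(x + y for x, y in zip(r, s)) for r, s in zip(X, Y))

def mmul(X, Y):
    """Product of two two-rowed square matrices."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

# Illustrative elements g and h.
g = (Fraction(3, 2), Fraction(-1))
h = (Fraction(2), Fraction(5, 7))

sum_preserved = to_matrix(cadd(g, h)) == madd(to_matrix(g), to_matrix(h))
product_preserved = to_matrix(cmul(g, h)) == mmul(to_matrix(g), to_matrix(h))
```

The correspondence is one-to-one because the pair (a, b) can be read off from the first column of its matrix.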


EXERCISES 

1. Verify the statement that the field of all rational functions with rational
coefficients of the complex number $\sqrt{2}$ is equivalent to the field of all matrices



for rational a and b under the correspondence $A \leftrightarrow a + b\sqrt{2}$.

* We use the notation $V_0$ instead of V' to avoid confusion in the consequent usage
of u' for the arbitrary vector of V' as well as for the transpose of u.

2. Verify the statement that the field of complex numbers is equivalent to the
field of matrices

for real a and b.

3. Verify the statement that the set of all scalar matrices with elements in a field
F forms a field equivalent to F.

4. Verify the statement that the mathematical system consisting of all two-rowed
square matrices with elements in F and defined with respect to addition and
multiplication is equivalent to the set of all matrices

$\begin{pmatrix} a_{11} & 0 & a_{12} & 0 \\ 0 & a_{11} & 0 & a_{12} \\ a_{21} & 0 & a_{22} & 0 \\ 0 & a_{21} & 0 & a_{22} \end{pmatrix}$  ($a_{ij}$ in F)

under the correspondence indicated by the notation.


6. Linear spaces of finite order. We shall restrict all further study of
linear spaces to linear spaces of order n over F. Then we define V to be a
linear space of order n over a given field F if V is equivalent over F to $V_n$
over F. Clearly, our definition implies that every two linear spaces of the
same order n over F are equivalent over F.

Our definition also implies that, if V is a linear space of order n over F,
then the properties (2), (3), (4), and (5) hold for every u, v, w of V and
a, b in F. Moreover, every quantity of V is uniquely expressible in the form

(18) $c_1v_1 + \dots + c_nv_n$ ,

where the equivalence between V and $V_n$ is given by

(19) $c_1v_1 + \dots + c_nv_n \leftrightarrow (c_1, \dots, c_n)$ .

Conversely, define au = ua and u + v to be in V for every u, v of V and
a of F. Then it may be shown very easily that, if (2), (3), (4), and (5) hold
and every u of V is uniquely expressible in the form (18) for $c_i$ in F, then V
is equivalent over F to $V_n$. However, we prefer instead to define V by its
equivalence to $V_n$. This preference then requires the (somewhat trivial)
proof of

Theorem 6. Every linear subspace L of order r over F of $V_n$ is equivalent
over F to $V_r$.

Thus we justify the use of the term linear subspace L of order r over F by
proving that L is indeed what we have called a linear space of order r over F
contained in the space $V_n$. For proof we merely observe that every vector u
of $L = u_1F + \dots + u_rF$ is uniquely expressible in the form

(20) $u = c_1u_1 + \dots + c_ru_r$

for $c_i$ in F. Thus u in L uniquely determines the $c_i$ in F and conversely.
It follows that

(21) $u \leftrightarrow (c_1, \dots, c_r)$

is a one-to-one correspondence between L and $V_r$ and it is trivial to verify
that it defines an equivalence of L and $V_r$.

We have now seen that every linear space L of order n over F may be
regarded as a linear subspace of a space M of order m over F for any
$m \geq n$. Moreover, L = M if and only if m = n.

Theorem 3 should now be interpreted for arbitrary linear spaces of order
n, and we have a result which we state as

Theorem 7. Let $L = u_1F + \dots + u_nF$ and $v_1, \dots, v_m$ be in L so that
there exist quantities $a_{ij}$ in F for which

$v_i = a_{i1}u_1 + \dots + a_{in}u_n$ ,

and the coefficient matrix $A = (a_{ij})$ is defined. Then the number of the $v_k$
which are linearly independent in F is the rank of the matrix A. Moreover,
$v_1, \dots, v_m$ form a basis of L over F if and only if m = n and A is nonsingular.
EXERCISES 

1. Verify the statement that the following sets of matrices are linear spaces of
finite order over F and find a basis for each.

a) The set of all m by n matrices with elements in F.

b) The set of all m by n matrices whose elements not in the first row are zero. 

c) The set of all n-rowed scalar matrices. 

d) The set of all n-rowed diagonal matrices. 

2 . Find bases for the following linear spaces of polynomials with coefficients in F, 

a) All polynomials in x of degree at most three. 

b) All polynomials in independent variables x and y and degree at most two 
(in X and y together). 

c) All polynomials in x = t^ + t and y t^ + t^ and degree at most two in 

X and y, _ 

d) All polynomials in ^2, with F the field of all rational numbers. 

e) All polynomials in a primitive cube root of unity with F the field of all
rational numbers.

f) All polynomials in w = u(i + 1) with F the field of all rational numbers
and $i^2 = -1$, $u^2 = 2$. Hint: Prove that 1, u, i, ui are a basis.

g) The polynomials of (f) but with F the field of all real numbers.

3. Let A = diag {1, -1, 2}. Show that I, A, $A^2$ form a basis of the set of all
three-rowed diagonal matrices.

4. Show that I, A, AB form a basis of the set of all two-rowed square matrices
if


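5. Show that, if

$A = \begin{pmatrix} 0 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ ,  $B = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}$ ,

then I, A, $A^2$, B, AB, $A^2B$, $B^2$, $AB^2$, $A^2B^2$ form a basis of the set of all three-rowed
square matrices.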

7. Addition of linear subspaces. If $L_1 = \{v_1, \dots, v_m\}$ and
$L_2 = \{w_1, \dots, w_q\}$ are linear subspaces over F of a space L of order n over F,
the subspace $L_0 = \{v_1, \dots, v_m, w_1, \dots, w_q\}$ of L will be called the sum of
$L_1$ and $L_2$ and will be designated generally by

(22) $L_0 = \{L_1, L_2\}$ .

If the only vector which is in both $L_1$ and $L_2$ is the zero vector, we shall say
that $L_1$ and $L_2$ are complementary subspaces of their sum and write

$L_0 = L_1 + L_2$ .

In this case the order of $L_0$ is the sum of the orders of $L_1$ and $L_2$ and in fact
we shall show that, if $L_1 = v_1F + \dots + v_mF$, $L_2 = w_1F + \dots + w_qF$,
then $L_0 = v_1F + \dots + v_mF + w_1F + \dots + w_qF$.

For it is clear that $a_1v_1 + \dots + a_mv_m + b_1w_1 + \dots + b_qw_q = 0$ if and
only if $v = a_1v_1 + \dots + a_mv_m = (-b_1)w_1 + \dots + (-b_q)w_q$ is in both $L_1$
and $L_2$. Thus $v \neq 0$ implies that the $a_i$ and $b_j$ are not all zero and therefore
that the vectors $v_i$ and $w_j$ spanning $L_0$ are linearly dependent in F and do
not form a basis of $L_0$. Conversely, if necessarily v = 0, then the $v_i$ and $w_j$
do form a basis of $L_0$, and $L_0$ has order m + q.
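
The criterion just proved gives a simple computational test: given bases of $L_1$ and $L_2$, the two subspaces are complementary in their sum exactly when the combined set of m + q vectors is linearly independent. A minimal sketch follows, in Python over the rational field; the function names and sample vectors are hypothetical, and independence is tested by the rank of the matrix whose rows are the vectors.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) over the rationals, by row reduction."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def are_complementary(basis1, basis2):
    """True when the combined bases are independent, so the sum has order m + q.

    Both arguments are assumed to be bases (independent sets) of their subspaces.
    """
    return rank(list(basis1) + list(basis2)) == len(basis1) + len(basis2)

L1 = [(1, 0, 0), (0, 1, 0)]   # a subspace of order 2 of V_3
L2 = [(0, 0, 1)]              # meets L1 only in the zero vector
L3 = [(1, 1, 0)]              # lies inside L1
```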

If $L_1$ and $L_0$ are linear subspaces of L and if $L_0$ contains $L_1$, we may ask
whether a linear subspace $L_2$ of $L_0$ exists such that $L_0 = L_1 + L_2$. The
existence of such a space is clearly a corollary of

Theorem 8. Let $L_1$ of order m over F be a linear subspace of L of order n
over F. Then there exists a linear subspace $L_2$ of L such that $L_1$ and $L_2$ are
complementary subspaces of L.

The result above may be proved by the method we used to prove Theorem 1,
where we apply this method to the set $v_1, \dots, v_m, u_1, \dots, u_n$ in
$L = u_1F + \dots + u_nF$. However, let us give, instead, a proof using matrix
theory. We put $L = V_n$, let G be the m by n matrix whose rows are the basal
vectors $v_1, \dots, v_m$ of $L_1$. Then G has rank m, and there exists a nonsingular
matrix Q such that the columns of GQ are a permutation of those of G,

$GQ = (G_1 \;\; G_2)$ ,

where $G_1$ is nonsingular. Then the matrix

$A_0 = \begin{pmatrix} G_1 & G_2 \\ 0 & I_{n-m} \end{pmatrix}$

is nonsingular, and $A = A_0Q^{-1}$ is obtained by permuting the columns of $A_0$.
But then

$A = \begin{pmatrix} G \\ H \end{pmatrix}$

is nonsingular, the rows of A span $V_n$, and the rows of H span the space $L_2$
which we have been seeking. Moreover, it is clear that the rows of H are
certain of the vectors $e_j$ which we defined in Section 2.
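
A complement made up of unit vectors can be found mechanically. The sketch below, in Python over the rational field, follows the first method mentioned above (that of Theorem 1 applied to the basis of $L_1$ followed by $e_1, \dots, e_n$) and not the matrix proof; the function names are hypothetical and the sample basis is an illustrative assumption.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) over the rationals, by row reduction."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def unit_vector(j, n):
    """The vector e_j of V_n (j counted from 1)."""
    return tuple(1 if k == j - 1 else 0 for k in range(n))

def complement_by_unit_vectors(basis, n):
    """Unit vectors e_j which, adjoined to the given basis of L_1, span V_n."""
    current = list(basis)
    chosen = []
    for j in range(1, n + 1):
        e = unit_vector(j, n)
        if rank(current + [e]) > len(current):
            current.append(e)
            chosen.append(e)
    return chosen

G = [(1, 2, 0, 0), (1, -1, 1, 0)]      # illustrative basis of L_1 in V_4
H = complement_by_unit_vectors(G, 4)   # rows spanning a complement L_2
```

For this choice of G the vectors $e_2$ and $e_3$ already lie in the space spanned by G and $e_1$, so the complement found is spanned by $e_1$ and $e_4$.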


EXERCISES 


1. Let $L_1$ be the row space of each of the m by n matrices of Section 4, Ex. 3. Use
the method above to find a basis of the corresponding $V_n$ consisting of a basis of
$L_1$ and of a complementary space $L_2$.

2. Let the following vectors $u_i$ span $L_1$, $v_i$ span $L_2$. Find a complement in
$\{L_1, L_2\}$ to $L_1$ and to $L_2$.

a) $u_1 = (1, -1, 1, 1)$ ,  $u_2 = (2, -2, 1, 2)$ ,  $u_3 = (1, -1, 0, 1)$ ,
   $v_1 = (1, 2, 1, 0)$ ,  $v_2 = (4, -1, 4, 0)$

b) $u_1 = (1, 2, 0, 0)$ ,  $u_2 = (1, -1, 1, 0)$ ,  $u_3 = (1, 0, 0, 1)$ ,
   $v_1 = (4, 3, 1, 1)$ ,  $v_2 = (4, -3, 3, 2)$ ,  $v_3 = (1, 4, 2, -3)$

c) $u_1 = (1, 0, 2, -1)$ ,  $u_2 = (0, 1, 2, -1)$ ,  $u_3 = (2, 1, 6, -3)$ ,
   $v_1 = (3, -2, 2, -1)$ ,  $v_2 = (-5, 3, -4, 2)$ ,  $v_3 = (-2, 1, -2, 1)$





8. Systems of linear equations. The set of all linear forms in $x_1, \dots, x_n$
with coefficients in F is a linear space

(24) $L = x_1F + \dots + x_nF$

of order n over F. The left members

(25) $f_i = f_i(x_1, \dots, x_n) = a_{i1}x_1 + \dots + a_{in}x_n$

of a system

(26) $a_{i1}x_1 + \dots + a_{in}x_n = c_i$  (i = 1, ..., m) ,

of m linear equations in n unknowns, are such forms and are in L. Then,
if r is the rank of the m by n matrix $A = (a_{ij})$ of coefficients of (26), we
see by Theorem 7 that certain r of the forms $f_i$ are linearly independent in
F and the remaining m - r forms are linear combinations of these r.

We may assume without loss of generality that the equations (26) have 
been labeled so that /i, . . . , /r are linearly independent in F, and 

(27) ik = 5fcl/l + . . . + &A:r/r (fc = T + 1, • • • > w) 


for bkj in F. Then 


Ui 


(28) A = 


\ 


V/i (Uil, . • ■ , U<n) 

(z = 1, . . . , m) , 


where wi, . . . , Wr are linearly independent in F, and . 

(29) Uk ~ bklUfl ”4“ • • • "4“ bkr^r (J^ ~ r "4“ 1, . . . , fl%) • 

If the system (26) is consistent, there exist quantities di, . . . , dn in F such 
that /i(di, . . . , dn) = Ct. Then (27) implies that 

Ck = fk{diy . . . , dn) = bklCi + . . . + bkrCr 

(fc = r + 1, . . . , m) . 


(30) 





Define the augmented matrix $A^*$ of the system (26) to be the $m$ by $n + 1$ matrix

(31) $A^* = \begin{pmatrix} u_1^* \\ \vdots \\ u_m^* \end{pmatrix}, \quad u_i^* = (u_i, c_i) = (a_{i1}, \ldots, a_{in}, c_i) \quad (i = 1, \ldots, m)$,

and see that (29) and (30) imply that

(32) $u_k^* = b_{k1}u_1^* + \ldots + b_{kr}u_r^* \quad (k = r + 1, \ldots, m)$,

so that the rank of $A^*$ is at most $r$. But $A^*$ has $A$ as a submatrix, so its rank is at least $r$; hence $A^*$ has rank $r$.

Conversely, if $A^*$ has the same rank $r$ as $A$ and we choose $u_1, \ldots, u_r$ to be linearly independent, then $u_1^*, \ldots, u_r^*$ are clearly linearly independent. We then have (29) and may apply elementary row transformations of type 2 to $A^*$ which add $-(b_{k1}u_1^* + \ldots + b_{kr}u_r^*)$ to $u_k^*$ for $k = r + 1, \ldots, m$. These replace the submatrix $A$ of $A^*$ by

(33) $\begin{pmatrix} G \\ 0 \end{pmatrix}, \quad G = \begin{pmatrix} u_1 \\ \vdots \\ u_r \end{pmatrix}$,

and replace $A^*$ by

(34) $\begin{pmatrix} G & C_1 \\ 0 & C_2 \end{pmatrix}$,

where $C_1$ is the one-columned matrix with elements $c_1, \ldots, c_r$ and the $(m - r)$-rowed and one-columned matrix $C_2$ has $c_{k0} = c_k - (b_{k1}c_1 + \ldots + b_{kr}c_r)$ as its elements. But, clearly, if any $c_{k0} \neq 0$, the matrix $A^*$ has a nonzero $(r + 1)$-rowed minor. This is impossible if $A^*$ is of rank $r$.

We now see that the system (26) is consistent if and only if $A^*$ has the same rank $r$ as $A$. Moreover, we have already shown that, if $A^*$ does have the same rank as $A$, then $m - r$ of the equations (26) may be regarded as linear combinations of the remaining $r$ equations and are satisfied by the solutions of these equations. It thus remains to show that any system of $r$ equations in $x_1, \ldots, x_n$ with matrix of rank $r$ has solutions. We may write $x = (x_1, \ldots, x_n)$ and see that such a system may be regarded as a matrix equation

(35) $Gx' = v', \quad v = (c_1, \ldots, c_r)$.

Before solving (35) we prove

Lemma 4. Let $G$ be an $r$ by $n$ matrix of rank $r$. Then

(36) $G = (I_r \; 0)Q$

for a nonsingular $n$-rowed matrix $Q$.

For Lemma 2 states that $G = (H \; 0)Q_1$, where $Q_1$ is $n$-rowed and nonsingular and $H$ is an $r$ by $r$ matrix of rank $r$. Then $H$ is nonsingular, so is $Q_2 = \text{diag}\{H, I_{n-r}\}$, and $(I_r \; 0)Q_2 = (H \; 0)$. Thus we have (36) for $Q = Q_2Q_1$.

The system (35) may now be written as

(37) $(I_r \; 0)y' = v', \quad y = xQ' = (y_1, \ldots, y_n)$.

But then $y = xQ'$ is a nonsingular linear transformation and the $y_i$ are linearly independent linear forms in $x_1, \ldots, x_n$. Evidently, (37) has the solution $y_i = c_i$ for $i = 1, \ldots, r$; the solution of (35) is then given by $x = y(Q')^{-1}$ for $y_i = c_i$ $(i = 1, \ldots, r)$ and $y_{r+1}, \ldots, y_n$ arbitrary. Observe that, if we choose the notation of the $x_i$ so that $G = (G_1, G_2)$ with $G_1$ an $r$-rowed nonsingular matrix, then

(38) $Q = \begin{pmatrix} G_1 & G_2 \\ 0 & I_{n-r} \end{pmatrix}, \quad y = (Y_1, Y_2), \quad x = (X_1, X_2)$,

where $Y_1$ and $X_1$ have $r$ columns. From this we obtain

$Q' = \begin{pmatrix} G_1' & 0 \\ G_2' & I_{n-r} \end{pmatrix}, \quad xQ' = (X_1G_1' + X_2G_2', \; X_2)$,

so that

$X_2 = Y_2 = (x_{r+1}, \ldots, x_n), \quad Y_1 = X_1G_1' + X_2G_2' = (c_1, \ldots, c_r)$,

and our solution of (35) is given by

(39) $X_1 = (Y_1 - X_2G_2')(G_1')^{-1}$.

But then we have solved for $x_1, \ldots, x_r$ as linear functions of $x_{r+1}, \ldots, x_n$. For exercises on linear equations we refer the reader to the First Course in the Theory of Equations.
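The consistency criterion of this section lends itself to a small computation. The sketch below, which assumes rational coefficients and uses the illustrative name `solve`, reduces the augmented matrix $A^*$ by row transformations. A pivot in the last column means that $A^*$ has larger rank than $A$, so the system is inconsistent; otherwise one solution is read off with the free unknowns set to zero.

```python
from fractions import Fraction

def solve(A, c):
    """Return one solution of A x' = c', or None if the system is inconsistent."""
    n = len(A[0])
    # augmented matrix A* with the constants as last column
    m = [[Fraction(a) for a in row] + [Fraction(ci)] for row, ci in zip(A, c)]
    pivots, r = [], 0
    for col in range(n + 1):
        p = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        piv = m[r][col]
        m[r] = [a / piv for a in m[r]]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(col)
        r += 1
    if n in pivots:
        return None  # rank of A* exceeds rank of A
    x = [Fraction(0)] * n  # free unknowns are set to 0
    for row, col in enumerate(pivots):
        x[col] = m[row][n]
    return x

# Hypothetical system: the third equation is the sum of the first two.
A = [[1, 2, 1], [2, 4, 3], [3, 6, 4]]
print(solve(A, [1, 3, 4]))  # consistent
print(solve(A, [1, 3, 5]))  # inconsistent, so None
```

Setting the free unknowns to other values would give the remaining solutions, in agreement with (39).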

9. Linear mappings and linear transformations. The system of equations (3.1) of a linear mapping was expressed in (3.32) as a matrix equation $x' = y'A'$. Let us interchange the roles of $m$ and $n$, $A$ and $A'$ in this equation. Then we see that a linear mapping may be expressed as a matrix equation

(40) $v = uA$,

where $A$ is an $m$ by $n$ matrix and

(41) $u = (y_1, \ldots, y_m), \quad v = (x_1, \ldots, x_n)$.

Clearly, $u$ is a vector of $V_m$ over $F$, $v$ is a vector of $V_n$ over $F$, and (40) may be regarded as a correspondence $u \to uA$ defined by $A$ whereby every $u$ of $V_m$ determines a unique vector $uA$ of $V_n$. We now proceed to formulate this concept more abstractly.

Let $L$ and $M$ be linear spaces of respective orders $m$ and $n$ over $F$ and consider a correspondence on $L$ to $M$. Designate the correspondence by the symbol $S$, so that $S$ is the function

$S: \; u \to u^S$

(read: $u$ goes to $u$ upper $S$) wherein every vector $u$ of $L$ determines a unique $u^S$ in $M$. Suppose also that

(42) $(au + bu_0)^S = au^S + bu_0^S$

for every $a$ and $b$ of $F$, $u$ and $u_0$ of $L$. Then we shall call $S$ a linear mapping of the space $L$ on the space $M$ and describe (42) as the property that $S$ is linear.

Suppose now that $L = u_1F + \ldots + u_mF$ and $M = v_1F + \ldots + v_nF$, so that we are given not only the spaces $L$ and $M$ but fixed bases as well. Then a linear mapping $S$ uniquely determines $u_i^S$ in $M$, and hence

(43) $u_i^S = a_{i1}v_1 + \ldots + a_{in}v_n \quad (i = 1, \ldots, m)$,

for $a_{ij}$ in $F$. Thus $S$ determines also an $m$ by $n$ matrix $A = (a_{ij})$. But, conversely, if $A$ is an $m$ by $n$ matrix and we define $u_i^S$ by (43) for the given elements $a_{ij}$ of $A$, then the property that $S$ is linear uniquely determines the mapping $S$. This is true since every $u$ of $L$ is uniquely expressible in the form

(44) $u = y_1u_1 + \ldots + y_mu_m$,

for $y_i$ in $F$ and (42) implies that

(45) $u^S = y_1u_1^S + \ldots + y_mu_m^S$.

It follows that to every linear mapping $S$ of $L$ on $M$ and given bases of $L$ and $M$, there corresponds a unique $m$ by $n$ matrix $A$ and conversely. We shall call $A$ the matrix of $S$ with respect to the given bases of $L$ and $M$.

We now observe that

$u^S = \sum_{i=1}^{m} y_iu_i^S = \sum_{i=1}^{m} y_i \left( \sum_{j=1}^{n} a_{ij}v_j \right) = \sum_{j=1}^{n} x_jv_j$,

where

$x_j = \sum_{i=1}^{m} y_ia_{ij} \quad (j = 1, \ldots, n)$.

But then, if we assume temporarily that $L = V_m$ and $M = V_n$ and put $v = u^S$ in (40) and (41), we see that $S$ is the linear mapping

(46) $u \to u^S = uA$

for the given matrix $A$. Thus every linear mapping which is a change of variable as in (3.1) may be regarded as a linear mapping of the space $V_m$ on the space $V_n$.

Let us next observe the effect on the matrix defined by a linear mapping of a change of bases of the linear spaces. Define new bases of $L$ and $M$, respectively, by

(47) $u_k^{(0)} = \sum_{i=1}^{m} p_{ki}u_i, \quad v_l^{(0)} = \sum_{j=1}^{n} q_{lj}v_j$

for $k = 1, \ldots, m$ and $l = 1, \ldots, n$. Then $P = (p_{ki})$ and $Q = (q_{lj})$ are nonsingular and, as we saw in (3.33), we may also write the second set of equations of (47) in the form

(48) $v_j = \sum_{l=1}^{n} r_{jl}v_l^{(0)} \quad (j = 1, \ldots, n)$,

where $R = (r_{jl}) = Q^{-1}$. We apply the linearity of $S$ to (47) and obtain

$(u_k^{(0)})^S = \sum_{i=1}^{m} p_{ki}u_i^S$.

Substituting (43) and (48), we have

(49) $(u_k^{(0)})^S = \sum_{l=1}^{n} b_{kl}v_l^{(0)}$,

where $b_{kl} = \sum_{i,j} p_{ki}a_{ij}r_{jl}$. Hence the matrix $B = (b_{kl})$ of the linear mapping $S$ with respect to our new bases is given by

(50) $B = PAQ^{-1}$.

Since $P$ and $Q^{-1}$ are arbitrary nonsingular matrices of $m$ and $n$ rows, respectively, we see that changes of basis in $L$ and $M$ replace $A$ by an equivalent matrix. Thus any two equivalent $m$ by $n$ matrices define the same mapping of $L$ on $M$, each with respect to a suitable choice of bases.
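Formula (50) may be checked on a small example. The following sketch works over the rationals with row vectors, as in the text; the helper names `matmul` and `inverse` and the sample matrices are illustrative assumptions. A vector with new coordinates $y_0$ has old coordinates $y_0P$, and its image has old coordinates $y_0PA$, which must agree with $(y_0B)Q$.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse(M):
    """Inverse of a nonsingular square matrix by Gauss-Jordan elimination."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        # raises StopIteration if M is singular
        p = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [a / piv for a in aug[c]]
        for i in range(n):
            if i != c and aug[i][c] != 0:
                f = aug[i][c]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[c])]
    return [row[n:] for row in aug]

# Hypothetical data: A is 2 by 3, P and Q are nonsingular.
A = [[1, 2, 0], [0, 1, 3]]
P = [[1, 1], [0, 1]]
Q = [[1, 0, 0], [1, 1, 0], [0, 1, 1]]
B = matmul(matmul(P, A), inverse(Q))  # formula (50)

y0 = [[2, 5]]  # coordinates relative to the new basis of L
old_image = matmul(matmul(y0, P), A)   # image in old coordinates of M
new_image = matmul(y0, B)              # image in new coordinates of M
print(matmul(new_image, Q) == old_image)
```

Exact rational arithmetic is used so that the comparison is an equality and not an approximation.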

If $L$ and $M$ are the same space we shall henceforth call a linear mapping $S$ of $L$ on $L$ a linear transformation of $L$. Since we are now considering only a single space, the only possible meaning of the $u_i$ and $v_j$ in (43) can be that of a fixed basis $u_1, \ldots, u_n$ of $L$ of order $n$ over $F$ and of a second basis $v_1, \ldots, v_n$ of $L$. Let us restrict our attention to the case where $v_i = u_i$. Then we define the matrix $A$ of a linear transformation $S$ on $L$ with respect to a fixed basis $u_1, \ldots, u_n$ of $L$ to be the matrix $A$ determined by (43) with $u_i = v_i$. We have defined thereby a one-to-one correspondence between the set of all $n$-rowed square matrices with elements in $F$ and the set of all linear transformations on $L$ of order $n$ over $F$. If $L = V_n$, such a correspondence is given by

$u \to u^S = uA$.

Clearly, we should and do call $S$ a nonsingular linear transformation if $A$ is nonsingular. Since we may then solve for $u = u^SA^{-1}$, we see that a nonsingular linear transformation defines a one-to-one correspondence of $L$ and itself.

We now observe the effect on $A$ of a change of basis of $L$. We use (47) but note that $u_i = v_i$ and $u_i^{(0)} = v_i^{(0)}$, so that we have $P = Q$. Hence a change of basis* of $L$ with matrix $P$ replaces the matrix $A$ of a linear transformation by

(51) $B = PAP^{-1}$.

It is now clear that in order to study those properties of a linear transformation on $L$ which do not depend on the basis of $L$ over $F$ we need only study those properties of square matrices $A$ which are unchanged when we pass to $PAP^{-1}$. We shall call two matrices $A$ and $B$ similar if (51) holds for a nonsingular matrix $P$ and shall obtain necessary and sufficient conditions that $A$ and $B$ be similar in our next chapter.

EXERCISES 

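1. Let $S$ be the linear mapping (46) of $V_3$ on $V_4$ defined for the following matrices $A$. Find the vectors of $V_4$ into which $(-2, 3, 4)$, $(1, 0, 0)$, $(0, 1, 0)$ of $V_3$ are mapped by $S$.

a) $A = \begin{pmatrix} 1 & -3 & 0 & 1 \\ 2 & -6 & -1 & 2 \\ -1 & 3 & 1 & -1 \end{pmatrix}$

b) $A = \begin{pmatrix} -2 & 3 & 4 & 0 \\ -1 & 2 & 0 & 4 \\ 0 & 0 & 2 & -3 \end{pmatrix}$

c) $A = \begin{pmatrix} 2 & -1 & 2 & -5 \\ 0 & -2 & 3 & -2 \\ 1 & 1 & -1 & -1 \end{pmatrix}$

d) $A = \begin{pmatrix} -2 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ -1 & 1 & 1 & 1 \end{pmatrix}$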

2. Show that the linear transformations (46) of $V_3$ defined for the following matrices $A$ are nonsingular and find their inverse transformations. Apply both $S$ and $S^{-1}$ to the vectors of Ex. 1.

/-4 3 5\ /3 -1 -4\ 

a) A = 3 0 20 ) 6) A = 5 -2 - 1 ) 

\ 5 -2 11 / \2 -1 2 / 

3. Define $S$ for the matrices of Ex. 2 and let $C$ be one or the other of the curves of all vectors (points) $u = (x_1, x_2, x_3)$ whose coordinates satisfy the following equations. Find the equation of the curves into which each $S$ carries each $C$.

a) 3x? — 2x1 + 2xf + 4 xiX 2 ~ 2xiXs — 2xaX8 = 1 
h) — 4x? + lla^l = — 6 x 1 X 2 — 10x1X3 — 18 x 2 X 3 + 1 

* Observe that a change of bases (47) of $L$ defines a linear transformation of $L$ when we put $u_i^S = u_i^{(0)}$. Thus we may regard a change of basis as being induced by a nonsingular linear transformation.
10. Orthogonal linear transformations. The final topic of our study of linear spaces will be a brief introduction to those linear transformations permitted in what is called Euclidean geometry. Let then $V_n$ be the $n$-dimensional linear space of vectors $u = (c_1, \ldots, c_n)$ over a field $F$. We define the norm of $u$ (square of the length of $u$) to be the value $f(u) = f(c_1, \ldots, c_n)$ of the quadratic form

(52) $f(x_1, \ldots, x_n) = x_1^2 + \ldots + x_n^2$.

We propose to study those linear transformations $S$ on $V_n$ which are said to be length preserving and define this concept as the property that $f(u) = f(u^S)$ for every $u$ of $V_n$. Such transformations $S$ will be called orthogonal.

We may define $S$ by $u^S = uA$ for an $n$-rowed square matrix $A$. Then, clearly, $f(u) = uu'$, $f(u^S) = u^S(u^S)' = uAA'u'$. Write $B = AA' = (b_{ij})$ and see that $f(u^S) = \sum c_ib_{ij}c_j$, $f(u) = \sum c_i^2$. Put $c_p = 1$, $c_j = 0$ for $j \neq p$ and have $f(u) = f(u^S)$ only if $b_{pp} = 1$ for every $p$. Then take $c_p = c_q = 1$, all other $c_j = 0$, from which $f(u) = 2$, $f(u^S) = 2 + 2b_{pq} = f(u)$ only if $b_{pq} = 0$ for $p \neq q$, so that $B = I$ is the identity matrix. We call matrices $A$ satisfying

(53) $AA' = I$

orthogonal matrices and have shown that $S$ is orthogonal if and only if its matrix is an orthogonal matrix.

We have seen that $S$ determines $A$ uniquely only in terms of a fixed basis of $V_n$. Now in Euclidean geometry the only changes of basis allowed are those obtained by an orthogonal linear transformation, that is, those for which the matrix $P$ of (51) is orthogonal. But then $PP' = I$, $P' = P^{-1}$, $P'P = I$, so that if $S$ is a linear transformation with orthogonal matrix $A$ and we replace $A$ by $PAP^{-1} = B$, then

$BB' = (PAP')(PA'P') = PAA'P' = I$,

and $B$ is also orthogonal.
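The three facts of this section (condition (53), preservation of the norm, and invariance under an orthogonal change of basis) can be verified exactly on a rational example. This is a minimal sketch; the matrix with entries $3/5$ and $4/5$ and the helper names are assumptions chosen for illustration.

```python
from fractions import Fraction as Fr

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def is_orthogonal(A):
    """Check condition (53): A A' = I."""
    n = len(A)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    return matmul(A, transpose(A)) == identity

def norm(u):
    """The norm of the text: the square of the length of u."""
    return sum(c * c for c in u)

A = [[Fr(3, 5), Fr(4, 5)], [Fr(-4, 5), Fr(3, 5)]]
u = [2, -7]
uS = matmul([u], A)[0]  # the image u^S = uA

P = [[0, 1], [1, 0]]  # an orthogonal change of basis
B = matmul(matmul(P, A), transpose(P))  # P A P' = P A P^{-1}

print(is_orthogonal(A), norm(u) == norm(uS), is_orthogonal(B))
```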

EXERCISES 

1. What are the possible values of the determinant of an orthogonal matrix? 

2. Let $I$ be the identity matrix, $A$ be a skew matrix such that $I + A$ is nonsingular. Show that $(I + A)^{-1}(I - A)$ is orthogonal.

3. Show that every orthogonal two-rowed matrix A has the form
where a = s{s^ + h = s and i range over all quantities in F

such that s® + 0. Hint: The result is trivial if A is a diagonal matrix. Now 

ah + Oi 2 = 1 so that s = an, t = ai 2 are of the form above, while, conversely, 
a® + 62 — (g 2 ^ = I The values 021 = ±^, 022 = T o are derived from 

AA' = L 

4. The equations for rotation of axes in a real Euclidean plane through an angle $\theta$ are $x = x_0 \cos \theta - y_0 \sin \theta$, $y = x_0 \sin \theta + y_0 \cos \theta$. Show that the corresponding matrix is orthogonal and that every real orthogonal two-rowed matrix is either the matrix of a rotation or its product by the matrix

$\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$

of a reflection $x = x_0$, $y = -y_0$.

5. Find a real orthogonal matrix $P$ for each of the following symmetric matrices $A$ such that $PAP'$ is a diagonal matrix. Hint: Compute $PAP' = D = (d_{ij})$ and put $d_{12} = 0$.

«)(;s) «)(!!) 

11. Orthogonal spaces. We shall call two vectors $u$ and $v$ of $V_n$ orthogonal (that is, in a geometric sense, perpendicular) if $uv' = 0$. Then, if $A$ is any orthogonal matrix and we define a linear transformation $S$ of $V_n$ by $u^S = uA$, we have

$u^S = uA, \quad v^S = vA, \quad u^S(v^S)' = uAA'v' = uv' = 0$

if and only if $uv' = 0$. Thus orthogonal transformations on $V_n$ preserve orthogonality of vectors of $V_n$.

If $L = v_1F + \ldots + v_mF$ is a linear subspace of $V_n$, we define the set $O(L)$ to be the set of all vectors $w$ in $V_n$ such that $vw' = 0$ for every $v$ of $L$. Then $O(L)$ is a linear subspace of $V_n$ which we shall call the space orthogonal in $V_n$ to $L$. For, clearly, if $w_1$ and $w_2$ are in $O(L)$ and $a$ and $b$ are in $F$, we have $v(aw_1 + bw_2)' = avw_1' + bvw_2' = 0$, so that $aw_1 + bw_2$ is in $O(L)$. We now prove

Theorem 9. Let $L$ be a linear subspace of order $m$ of $V_n$. Then the order of $O(L)$ is $n - m$.

In fact we shall prove

Theorem 10. Let $L$ be the row space of the $m$ by $n$ matrix $G$ of rank $m$ so that $G = (I_m \; 0)Q$ for a nonsingular $n$-rowed matrix $Q$. Then $O(L)$ is the row space of $H = (0 \; I_{n-m})(Q')^{-1}$ of rank $n - m$.

For proof we first note that the elements of $GH'$ are the products $v_iw_j'$ of the rows $v_i$ of $G$ by the transposes of the rows $w_j$ of $H$. But $GH' = (I_m \; 0)QQ^{-1}(0 \; I_{n-m})' = (I_m \; 0)(0 \; I_{n-m})'$ and, clearly, then $GH' = 0$. Let then $O(L) = y_1F + \ldots + y_qF$ of order $q$ over $F$ so that $O(L)$ contains the row space of $H$. Evidently $H$ has rank $n - m$, and we must have $q \geq n - m$. We let $H_0$ be the $q$ by $n$ matrix whose $k$th row is $y_k$, and we have $GH_0' = 0$, $0 = (I_m \; 0)QH_0'$. The rank of the $n$ by $q$ matrix

$D = QH_0' = \begin{pmatrix} D_1 \\ D_2 \end{pmatrix}$,

where $D_1$ has $m$ rows, is $q$ since $Q$ is nonsingular. But

$(I_m \; 0)D = D_1$,

so that $D_1 = 0$, $D$ has at most $n - m$ nonzero rows, and $D$ must have rank $q \leq n - m$. This proves that $q = n - m$ and that the row space of $H$ is $O(L)$.

Note that the row space of $H$ is the set of all solutions $x = (x_1, \ldots, x_n)$ of the homogeneous linear system $Gx' = 0$.

The sum of the orders of $L$ and $O(L)$ is $n$, and it is natural to ask whether or not they are complementary in $V_n$, that is, whether $V_n = L + O(L)$. This is not true in general, since if $F$ is the field of all complex numbers and $L = vF$, $i^2 = -1$, $v = (1, i)$, then $vv' = 1 + i^2 = 0$. Hence $v$ is in $O(L)$, so that $L$ is contained in $O(L)$; since both spaces have order 1, $O(L) = L$. However, we may prove

Theorem 11. Let $F$ be a field whose quantities are real numbers. Then $V_n = L + O(L)$.

For by Theorem 9 it suffices to prove that the only vector $w$ in $L$ and $O(L)$ is the zero vector. Hence let $w$ be in both $L$ and $O(L)$ so that $w = dG$, where $d = (d_1, \ldots, d_m)$ has elements in $F$ and $G$ is an $m$ by $n$ matrix of rank $m$ whose row space is $L$. Then $Gw' = 0$ while also $Gw' = GG'd'$. By Theorem 3.17 the matrix $GG'$ has rank $m$ and is nonsingular, $GG'd' = 0$ only if $d' = 0$, $d = 0$, $w = 0$ as desired.
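Since $O(L)$ is the solution space of $Gx' = 0$, a basis may be computed by row reduction. The sketch below assumes rational entries; the name `orthogonal_space_basis` and the sample matrix are illustrative. It also reproduces the complex example $v = (1, i)$, for which $vv' = 0$.

```python
from fractions import Fraction

def orthogonal_space_basis(G):
    """Basis of O(L), the solutions of G x' = 0, for L the row space of G."""
    n = len(G[0])
    m = [[Fraction(a) for a in row] for row in G]
    pivots, r = [], 0
    for col in range(n):
        p = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        piv = m[r][col]
        m[r] = [a / piv for a in m[r]]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(col)
        r += 1
    basis = []
    for j in (k for k in range(n) if k not in pivots):
        w = [Fraction(0)] * n
        w[j] = Fraction(1)  # one free unknown equal to 1, the others 0
        for row, col in enumerate(pivots):
            w[col] = -m[row][j]
        basis.append(w)
    return basis

# Hypothetical example: L of order 2 in V_4, so O(L) has order 4 - 2 = 2.
G = [[1, 2, 0, 1], [0, 1, 1, 0]]
W = orthogonal_space_basis(G)
print(W)
print(all(sum(a * b for a, b in zip(v, w)) == 0 for v in G for w in W))

# Over the complex numbers the vector v = (1, i) is orthogonal to itself.
v = (1, 1j)
print(sum(a * a for a in v) == 0)
```

The number of basis vectors returned is $n - m$ when $G$ has rank $m$, in agreement with Theorem 9.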


EXERCISE 

Let $L$ be the space over the field of all real numbers spanned by the following vectors $u_i$. Find a basis of the space $O(L)$ in the corresponding $V_n$.

a) $u_1 = (1, 2, -1, 0)$, $u_2 = (0, 1, 2, 1)$

b) $u_1 = (1, 0, 1, 1)$, $u_2 = (0, 1, 0, 1)$, $u_3 = (-1, 2, 1, 0)$

c) $u_1 = (1, -1, 2, 1)$, $u_2 = (2, -1, 2, 3)$, $u_3 = (1, -2, 4, 0)$

d) $u_1 = (1, 2, -1)$, $u_2 = (-1, 1, 0)$, $u_3 = (3, 3, 3)$



CHAPTER V 


POLYNOMIALS WITH MATRIC COEFFICIENTS 
1. Matrices with polynomial elements. Let $F$ be a field and designate by

(1) $F[x]$

(read: $F$ bracket $x$) the set of all polynomials in $x$ with coefficients in $F$. We shall consider $m$ by $n$ matrices with elements in $F[x]$ and define elementary transformations of three types on such matrices as in Section 2.4. As we stated in that section, we assume that in the elementary transformations of type 2 the quantities $c$ are permitted to be any quantities of $F[x]$. But those of type 3 are restricted so that the quantity $a$ in $F[x]$ shall have an inverse in $F[x]$. Then $a \neq 0$ must be a constant polynomial, that is, $a$ may be any nonzero quantity of $F$.

We now let $A$ and $B$ be $m$ by $n$ matrices with elements in $F[x]$ and call $A$ and $B$ equivalent in $F[x]$ if there exists a sequence of elementary transformations carrying $A$ into $B$. The field $F(x)$ of all rational functions of $x$ with coefficients in $F$ contains $F[x]$, and it is thus clear that if $A$ and $B$ are equivalent in $F[x]$ they are also equivalent in $F(x)$. Hence we see that $A$ and $B$ are equivalent in $F[x]$ only if they have the same rank. We may then prove

Lemma 1. Every nonzero matrix $A$ of rank $r$ with elements in $F[x]$ is equivalent in $F[x]$ to

(2) $\begin{pmatrix} G_1 & 0 \\ 0 & 0 \end{pmatrix}$,

where $G_1 = \text{diag}\{f_1, \ldots, f_r\}$ for monic polynomials $f_i = f_i(x)$ such that $f_i$ divides $f_{i+1}$.

For the elements of all matrices equivalent in $F[x]$ to $A$ are polynomials in $x$, and in the set of all such polynomials there is a nonzero polynomial $f_1 = f_1(x)$ of lowest degree. Using elementary transformations of types 3 and 1, we may assume that $f_1$ is monic and is the element in the first row and column of a matrix $C = (c_{ij})$ equivalent in $F[x]$ to $A$. By the Division Algorithm for polynomials we may write $c_{i1} = q_if_1 + r_i$ for $q_i$ and $r_i$ in $F[x]$ and $r_i$ of degree less than the degree of $f_1$. But if we add $-q_i$ times the first row of $C$ to its $i$th row, we pass to a matrix equivalent in $F[x]$ to $A$ with $r_i$ as the element in its $i$th row and first column. Our definition of $f_1$ thus implies that $r_i$ is zero. Moreover, we have now shown $A$ equivalent in $F[x]$ to a matrix $D = (d_{ij})$ with $d_{i1} = 0$ for $i \neq 1$, $d_{11} = f_1$. Similarly, we see that every $d_{1j}$ is divisible by $f_1$ and hence that $A$ is equivalent in $F[x]$ to a matrix

(3) $\begin{pmatrix} f_1 & 0 \\ 0 & A_2 \end{pmatrix}$,

where $A_2$ has $m - 1$ rows and $n - 1$ columns. Then either $A_2 = 0$, or we may apply the same process to $A_2$. After a finite number of such steps we ultimately show that $A$ is equivalent in $F[x]$ to a matrix (2) of our lemma such that every $f_i = f_i(x)$ is monic and is a polynomial of least degree in the set of all elements of all matrices equivalent in $F[x]$ to

(4) $A_i = \begin{pmatrix} G_i & 0 \\ 0 & 0 \end{pmatrix}, \quad G_i = \text{diag}\{f_i, \ldots, f_r\}$.

Write $f_{i+1} = f_is_i + t_i$, where $s_i$ and $t_i$ are in $F[x]$ and the degree of $t_i$ is less than the degree of $f_i$. Then we add the first row of $A_i$ to its second row so that the submatrix in the first two rows and columns of the result is

(5) $\begin{pmatrix} f_i & 0 \\ f_i & f_{i+1} \end{pmatrix}$.

We then add $-s_i$ times the first column to the second column to obtain a matrix equivalent in $F[x]$ to $A_i$ and with corresponding submatrix

(6) $\begin{pmatrix} f_i & -f_is_i \\ f_i & t_i \end{pmatrix}$.

Our definition of $f_i$ thus implies that $t_i = 0$, so that $f_i$ divides $f_{i+1}$ as desired.

We observe now that the elements of every $t$-rowed minor of a matrix $A$ with elements in $F[x]$ are polynomials in $x$, so that these minors are also polynomials in $x$. By Section 2.7 if $B$ is obtained from $A$ by an elementary transformation, then every $t$-rowed minor of $B$ is either a $t$-rowed minor of $A$, the product of a $t$-rowed minor of $A$ by a quantity $a$ of $F$, or the sum $M_1 + fM_2$ where $M_1$ and $M_2$ are $t$-rowed minors of $A$ and $f$ is a polynomial of $F[x]$. If $d$ in $F[x]$ then divides every $t$-rowed minor of $A$, it also divides every $t$-rowed minor of all $m$ by $n$ matrices $B$ equivalent in $F[x]$ to $A$. But $A$ is also equivalent in $F[x]$ to $B$ and therefore the $t$-rowed minors of equivalent matrices have the same common divisors. We may state this result as

Lemma 2. Let $A$ be an $m$ by $n$ matrix with elements in $F[x]$ and $d_t$ be the greatest common divisor of all $t$-rowed minors of $A$. Then $d_t$ is also the greatest common divisor of the $t$-rowed minors of every matrix $B$ equivalent in $F[x]$ to $A$.

Every $(k + 1)$-rowed minor $M_{k+1}$ of $A$ may be expanded according to a row and is then a linear combination, with coefficients in $F[x]$, of $k$-rowed minors of $A$. Hence $d_k$ divides every $M_{k+1}$ so that $d_k$ divides $d_{k+1}$. We observe also that in (2) the only nonzero $k$-rowed minors are the $k$-rowed minors $|\text{diag}\{f_{i_1}, \ldots, f_{i_k}\}|$ for $i_1 < i_2 < \ldots < i_k$, and since clearly every $f_i$ divides $f_{i+1}$ we see that the g.c.d. of all $k$-rowed minors of (2) is $f_1 \ldots f_k$. We thus have $d_k = f_1 \ldots f_k$, whence

(7) $f_k = \dfrac{d_k}{d_{k-1}}$.

It is customary to relabel the polynomials $f_i$ and thus to write $f_r = g_1$, $f_{r-1} = g_2$, $\ldots$, $f_1 = g_r$. We call $g_j$ the $j$th invariant factor of $A$ and see that if we define $d_0 = 1$ we have the formula

(8) $g_j = \dfrac{d_{r-j+1}}{d_{r-j}} \quad (j = 1, \ldots, r)$.

Moreover, $g_j$ is now divisible by $g_{j+1}$ for $j = 1, \ldots, r - 1$. We apply elementary transformations of type 1 to (2) and see that if $A$ has invariant factors $g_1, \ldots, g_r$, then $A$ is equivalent in $F[x]$ to

(9) $\begin{pmatrix} G & 0 \\ 0 & 0 \end{pmatrix}, \quad G = \text{diag}\{g_1, \ldots, g_r\}$.

If, then, $B$ is equivalent in $F[x]$ to $A$, it has the same greatest common divisors $d_k$ and hence the same $g_j$ of (8), while, if the converse holds, $B$ is equivalent in $F[x]$ to (9) and to $A$. We have proved

Theorem 1. Two $m$ by $n$ matrices with elements in $F[x]$ are equivalent in $F[x]$ if and only if they have the same invariant factors. Every $m$ by $n$ matrix of rank $r$ with invariant factors $g_1, \ldots, g_r$ is equivalent in $F[x]$ to (9).
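Formula (8) gives a direct way to compute invariant factors from the greatest common divisors $d_k$. The following sketch treats only two-rowed square matrices of rank 2 over the rationals; polynomials are stored as coefficient lists with the constant term first, and all function names are illustrative assumptions. Here $d_1$ is the g.c.d. of the four elements and $d_2$ is the determinant made monic, so that $g_1 = d_2/d_1$ and $g_2 = d_1$.

```python
from fractions import Fraction
from functools import reduce

def trim(p):
    """Drop zero leading coefficients; the zero polynomial is []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_sub(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a - b for a, b in zip(p, q)])

def poly_mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def poly_divmod(p, q):
    """Division Algorithm: return (quotient, remainder) for nonzero q."""
    p = [Fraction(c) for c in trim(p)]
    q = trim(q)
    quo = [Fraction(0)] * max(len(p) - len(q) + 1, 0)
    while len(p) >= len(q):
        shift = len(p) - len(q)
        coef = p[-1] / q[-1]
        quo[shift] = coef
        p = poly_sub(p, [Fraction(0)] * shift + [coef * c for c in q])
    return trim(quo), p

def monic(p):
    p = trim(p)
    return [Fraction(c) / p[-1] for c in p] if p else []

def poly_gcd(p, q):
    """Monic greatest common divisor by the Euclidean process."""
    p, q = trim(p), trim(q)
    while q:
        p, q = q, poly_divmod(p, q)[1]
    return monic(p)

def invariant_factors_2x2(M):
    """Invariant factors (g_1, g_2) of a two-rowed matrix of rank 2, by (8)."""
    (a, b), (c, d) = M
    d1 = reduce(poly_gcd, [a, b, c, d])
    d2 = monic(poly_sub(poly_mul(a, d), poly_mul(b, c)))
    if not d2:
        raise ValueError("matrix has rank less than 2")
    g1, rem = poly_divmod(d2, d1)
    assert rem == []  # d_1 always divides d_2
    return g1, d1

# diag{x(x - 1), x(x + 1)}: x^2 - x is [0, -1, 1], x^2 + x is [0, 1, 1].
M = [[[0, -1, 1], []], [[], [0, 1, 1]]]
g1, g2 = invariant_factors_2x2(M)
print(g1, g2)
```

For this matrix $d_1 = x$ and $d_2 = x^4 - x^2$, so the computed factors are $g_1 = x^3 - x$ and $g_2 = x$, and $g_2$ divides $g_1$ as the theory requires.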

As in our theory of the equivalence of matrices on a field we may obtain a theory of equivalence in $F[x]$ using matrix products instead of elementary transformations. We thus define a matrix $P$ to be an elementary matrix if both $P$ and $P^{-1}$ have elements in $F[x]$. Then, if $a = |P|$ and $b = |P^{-1}|$, the quantities $a$ and $b$ are in $F[x]$, $ab = |I| = 1$, so that $a$ and $b$ are nonzero quantities of $F$. But if $P$ has elements in $F[x]$ so does adj $P$. Hence so will $P^{-1} = |P|^{-1}$ adj $P$ if $|P|$ is a nonzero quantity of $F$. Thus we have proved that a square matrix $P$ with polynomial elements is elementary if and only if its determinant is a constant (that is, in $F$) and not zero.

We now observe that, in particular, the determinant of any elementary transformation matrix is in $F$. Hence, if $A$ and $B$ are square matrices with elements in $F[x]$ and are equivalent in $F[x]$, their determinants differ only by a factor in $F$. Moreover, if $|A|$ and $|B|$ are monic polynomials, then the equivalence of $A$ and $B$ in $F[x]$ implies that $|A| = |B|$ and, in fact, that when $A$ is a nonsingular matrix with $g_1, \ldots, g_n$ as invariant factors its determinant is the product $g_1 \ldots g_n$.

It is clear now that a square matrix $P$ with elements in $F[x]$ is elementary if and only if $P$ is equivalent in $F[x]$ to the identity matrix, and thus the invariant factors of $P$ are all unity. We may now redefine equivalence. We call two $m$ by $n$ matrices $A$ and $B$ with elements in $F[x]$ equivalent in $F[x]$ if there exist elementary matrices $P$ and $Q$ such that $PAQ = B$. Then we again have the result that $A$ and $B$ are equivalent in $F[x]$ if and only if they have the same invariant factors. For under our first definition $P$ and $Q$ are equivalent in $F[x]$ to identity matrices and hence may be expressed as products of elementary transformation matrices. But, if $P_0$ and $Q_0$ are elementary transformation matrices, the products $P_0A$, $AQ_0$ are the matrices resulting from the application of the corresponding elementary transformations to $A$. Hence, $PAQ$ must have the same invariant factors as $A$. The converse is proved similarly, and we have the result desired.

In closing let us note a rather simple polynomial property of invariant factors. The invariant factors of a matrix $A$ with elements in $F[x]$ and rank $r$ are certain monic polynomials $g_i(x)$ such that $g_{i+1}(x)$ divides $g_i(x)$ for $i = 1, \ldots, r - 1$. If $g_k(x) = 1$, then $g_j(x) = 1$ for larger $j = k + 1, \ldots, r$. Let us then call those $g_i(x) \neq 1$ the nontrivial invariant factors of $A$, the remaining $g_i(x) = 1$ the trivial invariant factors of $A$. Thus there exists an integer $t$ such that $g_i(x)$ has positive degree for $i = 1, \ldots, t$ and

$g_{t+1}(x) = \ldots = g_r(x) = 1$.


EXERCISES 

1. Express the following matrices as polynomials in x whose coefficients are 
matrices with constant elements. 



/l + 2 x 

x^ + 4 x^ A-X+ 2 

X® 4 - 4 x 4 - 2 

0 ) 

( ® 

X^ + X 

x2 


\l - 2x 

x* + 3 x* — 3 a: — 1 

X® — x* -f- 4 x — 2 


/ X* 

X 

x2 — 2x \ 

b) 

x» - 1 

1 +x 

CO 

1 

1 


\*» + 2a:* 

- 2 x^ + 2 x +2 

1 

1 


/x» + X* - 

- 2x X* — X® 

X® + x\ 

c) 

( x» 

X — X® 

X® 1 


\2x* + X* 

— 2 x+l X* — X — 

1 X +1/ 


/»* + * 

X X* + 1 \ 


d) 

(x* + X 

X 4. 1 X* + X 1 



\x* - 1 

2x 2x2 4. 2/ 



2. Let $A$ be an $m$ by $n$ matrix whose elements are in $F[x]$. Describe a process by means of which we may use elementary row transformations involving only the $i$th and $k$th rows of $A$ to replace $A$ by a matrix with the g.c.d. of $a_{ij}$ and $a_{kj}$ in its $i$th row and $j$th column. Hint: Use the g.c.d. process of Section 1.6 with $f = a_{ij}$, $g = a_{kj}$.

3. Use elementary transformations to carry the following matrices into the form (9).

a) $\begin{pmatrix} x(x - 1) & 0 \\ 0 & x(x + 1) \end{pmatrix}$

b) $\begin{pmatrix} x^2 + x - 2 & 0 \\ 0 & x^2 + 2x - 3 \end{pmatrix}$


4. Describe a process for reducing a matrix $A$ with elements in $F[x]$ to an equivalent diagonal matrix. How, then, may we use the process of Ex. 3 to carry this preliminary diagonal matrix into the form (9)?

5. Reduce the matrices of Ex. 1 to the form (9) by the use of elementary transformations.

6. Determine the invariant factors of the matrices of Ex. 1 by the use of (8).

7. Use elementary transformations to reduce the following matrices to the form (9).


o) 


[2x* 


1 

1 

—a; 

1 


2 X — - X* 

2 + x -f X* - 2x< 


X — x* — 2 
X - 2 
1 + 2x - x2 - x3 
— 1 + X — 2x* 


b) 


d) 


e) 



I2x^ + 4x + 2 
X* + X — 3 
3x + 6 
x + 2 


— X 

5x2 2x 

— 2x2 + X 
— x2 


2x2 + 3x + 1 
x2 - 1 
3x 3 
x + 1 


x2 — X — 1 
6x2 + 5a; + 2 
-2x2 + 2x + 1 
— x2 


2. Elementary divisors. Let $K$ be the field of all complex numbers so that

$g_1(x) = (x - c_1)^{e_{11}} \ldots (x - c_s)^{e_{1s}}$,

where $c_1, \ldots, c_s$ are the distinct complex roots of the first invariant factor of a matrix $A$ with elements in $K[x]$. Since $g_{i+1}(x)$ divides $g_i(x)$, it is clear that every $g_i(x)$ divides $g_1(x)$, and thus

(10) $g_i(x) = (x - c_1)^{e_{i1}} \ldots (x - c_s)^{e_{is}} \quad (i = 1, \ldots, r)$.

Here $r$ is the number of invariant factors of $A$, and $e_{i-1,j} \geq e_{ij} \geq 0$ for $i = 2, \ldots, r$. We shall call the $rs$ polynomials

$f_{ij} = (x - c_j)^{e_{ij}}$

the elementary divisors of $A$. Those for which $e_{ij} > 0$ will be called the nontrivial elementary divisors of $A$.
The invariant factors of $A$ clearly determine its elementary divisors uniquely as certain $rs$ powers $f_{ij}$ of linear functions $x - c_j$, where $r$ is the rank of $A$, and $s$ is the number of distinct roots of the invariant factors of $A$. Conversely, the elementary divisors of $A$ uniquely determine its invariant factors. In fact, let us consider a set of $q$ polynomials each a power with positive integral exponent of a monic linear factor with a complex root. The distinct roots in our set may then be labeled $c_1, \ldots, c_s$ and our polynomials have the form $h_{kj} = (x - c_j)^{n_{kj}}$ for $n_{kj} > 0$. For each $c_j$ let $t_j$ be the number of $h_{kj}$ in our set, $t$ be the maximum $t_j$. Then, clearly, $q = t_1 + \ldots + t_s \leq ts$, and our set of polynomials may be extended to a set of exactly $ts$ polynomials by adjoining $t - t_j$ polynomials $(x - c_j)^{n_{kj}}$ with $n_{kj} = 0$. Let us then order the $t$ exponents $n_{kj}$ to be integers $e_{ij}$ satisfying $e_{1j} \geq e_{2j} \geq \ldots \geq e_{tj} \geq 0$. Define $g_i(x)$ as in (10) for $i = 1, \ldots, t$ and obtain a set of polynomials $g_i(x)$ such that $g_{i+1}(x)$ divides $g_i(x)$ for $i = 1, \ldots, t - 1$; the $g_i(x)$ are the nontrivial invariant factors of a matrix $A$ whose nontrivial elementary divisors are the given $h_{kj}$. If $A$ has rank $r$, we have $r \geq t$, and we adjoin $(r - t)s$ new trivial elementary divisors $f_{ij}$ to obtain the complete set of $r$ invariant factors $g_i(x)$ of $A$.

It is now evident that two $m$ by $n$ matrices with elements in $K[x]$ are equivalent in $K[x]$ if and only if they have the same elementary divisors.

The matrix (9) has prescribed invariant factors and hence prescribed elementary divisors. However, it is desirable to obtain a matrix of a form exhibiting the elementary divisors explicitly. We shall do this. Let us prove first

Theorem 2. Let $f_1, \ldots, f_s$ be monic polynomials of $F[x]$ which are relatively prime in pairs. Then the only nontrivial invariant factor of the matrix $A = \text{diag}\{f_1, \ldots, f_s\}$ is its determinant $g = f_1 \ldots f_s$.

The result is trivial for $s = 1$. If $s = 2$, the g.c.d. of the elements $f_1$ and $f_2$ of $A_2 = \text{diag}\{f_1, f_2\}$ is unity, $d_1 = 1$, $f_1f_2$ is the only nontrivial invariant factor of $A_2$, and $A_2$ is equivalent in $F[x]$ to $\text{diag}\{f_1f_2, 1\}$. Assume, then, that $A_{s-1} = \text{diag}\{f_1, \ldots, f_{s-1}\}$ is equivalent in $F[x]$ to $\text{diag}\{g_{s-1}, 1, \ldots, 1\}$, where $g_{s-1} = f_1 \ldots f_{s-1}$. Then $A = \text{diag}\{f_1, \ldots, f_s\}$ is equivalent in $F[x]$ to $\text{diag}\{g_{s-1}, f_s, 1, \ldots, 1\}$. But $g_{s-1}$ is prime to $f_s$, and $\text{diag}\{g_{s-1}, f_s\}$ is equivalent in $F[x]$ to $\text{diag}\{g, 1\}$. Hence $A$ is equivalent to $\text{diag}\{g, 1, \ldots, 1\}$. Then $g$ is the only nontrivial invariant factor of $A$, and our theorem is proved.

We see now that if $g_i(x)$ is defined by (10) for distinct complex numbers $c_j$, the corresponding elementary divisors $f_{i1}, \ldots, f_{is}$ are relatively prime in pairs. By Theorem 2 the matrix $A_i = \text{diag}\{f_{i1}, \ldots, f_{is}\}$ is equivalent in $F[x]$ to $\text{diag}\{g_i, 1, \ldots, 1\}$. But then $A = \text{diag}\{A_1, \ldots, A_t\}$ is equivalent in $F[x]$ to $\text{diag}\{g_1, \ldots, g_t, 1, \ldots, 1\}$, and hence the nontrivial invariant factors of $A$ are $g_1, \ldots, g_t$. We examine the form of $A$ to see that we have proved

Theorem 3. Let $c_1, \ldots, c_r$ be complex numbers and $f_i = (x - c_i)^{n_i}$ for integers $n_i \geq 0$. Then the matrix

$A = \text{diag}\{f_1, \ldots, f_r\}$

has the $f_i$ as its elementary divisors.
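The passage from nontrivial elementary divisors back to invariant factors is a simple sorting procedure, which the following sketch carries out. It represents an elementary divisor $(x - c)^e$ by the pair `(c, e)` and an invariant factor by a dictionary from roots to exponents; this representation and the function name `invariant_factors` are assumptions made for illustration.

```python
from collections import defaultdict

def invariant_factors(elementary_divisors, rank):
    """Build g_1, ..., g_r from nontrivial elementary divisors (c, e), e > 0.

    Each g_i is returned as a dict {root: exponent}; an empty dict means g_i = 1.
    """
    by_root = defaultdict(list)
    for root, exp in elementary_divisors:
        if exp > 0:
            by_root[root].append(exp)
    # t is the largest number of divisors belonging to a single root
    t = max((len(v) for v in by_root.values()), default=0)
    if t > rank:
        raise ValueError("the rank must be at least t")
    factors = [dict() for _ in range(rank)]
    for root, exps in by_root.items():
        # the largest exponent goes to g_1, the next to g_2, and so on
        for i, e in enumerate(sorted(exps, reverse=True)):
            factors[i][root] = e
    return factors

# Hypothetical data: (x - 1)^2, (x - 1), (x + 2) for a matrix of rank 3.
print(invariant_factors([(1, 2), (1, 1), (-2, 1)], 3))
```

For this data the procedure yields $g_1 = (x - 1)^2(x + 2)$, $g_2 = x - 1$, and the trivial factor $g_3 = 1$, so that each $g_{i+1}$ divides $g_i$.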

EXERCISES 

1. The following polynomials are the nontrivial invariant factors of a matrix. What are its nontrivial elementary divisors?

a) X* + + x®, X* + X*, X* + X 

h) X* + X® + 2x® + 2x* + X® + X, + X, x 

c) x(x - 1)» (x® - 2x + 1), X® - 2x® + X, X - 1 

d) X® — 5x® + 9x® — 7x + 2, x® — 4x® + 5x — 2, x® — 3x + 2 

e) (x® - 1)®, (x® - 1)®, (x® - 1), X - 1 

/) (x^+l)\ x^+l 

2. The following polynomials are the nontrivial elementary divisors of a matrix 
whose rank is six. What are its invariant factors? 

a) (x - 1)®, (X - 1)®, (X - 1), (X + 1)®, (X + 1) 

b) (x - 2)*, (x - 2)», (x - 2), X, (x + 1)», x» 

c) (x - 3), (x - 3)’, (x - 3)‘, X*, X*, x' 

d) X, (x - 1), (x - 2), (x - 3), (x - 4), (x - 5) 

e) (x + 1)’, (x + 1)*, (x - 1), X, x\ x», x» 

/) X, (x - 1), X*, (x - 1)», x’, (x +1)« xs 

3. Find elementary transformations which carry the following matrices into the form (9).


/(x-l)‘ 0 0 \ 

/x» 0 0 \ 

a) 0 X - 2 0 

6) ( 0 X + 1 0 1 

\ 0 0 X - 1/ 

Vo 0 X + 2 / 


/X* 0 0 \ 

c) (0 (x - 1)» 0 ) 

Vo 0 X + 1/ 
3. Matric polynomials. Let the elements $a_{ij}$ of an m by n matrix A be
in F[x] and let s be a positive integer not less than the degree of any of the
$a_{ij}$. Then every $a_{ij}$ has s as a virtual degree and we may write

(11)    $a_{ij} = a_{ij}^{(0)} x^s + a_{ij}^{(1)} x^{s-1} + \ldots + a_{ij}^{(s)}$

for $a_{ij}^{(k)}$ in F. Define $A_k = (a_{ij}^{(k)})$ and obtain an expression of A in the form

(12)    $f(x) = A_0 x^s + \ldots + A_s$

for m by n matrices $A_k$. Thus f(x) is a polynomial in x of virtual degree
s with m by n matrix coefficients $A_k$ and virtual leading coefficient $A_0$.
Moreover, we say that f(x) has degree s and leading coefficient $A_0$ if $A_0 \ne 0$.

In order to be able to multiply as well as to add our matrices we shall
henceforth restrict our attention to n-rowed square matrices with elements
in F[x] and thus to the set of all polynomials in x with coefficients n-rowed
square matrices having elements in F. Let us call these polynomials n-rowed
matric polynomials. If all the $A_k$ in (12) are zero matrices, the polynomial
f(x) is the zero polynomial, and we again designate it by 0. Evidently
we have

Lemma 3. The degree of f(x) + g(x) is not greater than the degree of f(x) 
or g(x). 

Lemma 4. Let f(x) have degree n and leading coefficient $A_0$, g(x) have degree
m and leading coefficient $B_0$ such that $A_0 B_0 \ne 0$. Then the degree of
f(x)g(x) is m + n and the leading coefficient of f(x)g(x) is $A_0 B_0$.

As in Chapter I we use Lemma 4 in the derivation of the Division Algo¬ 
rithm for matric polynomials which we state as 

Theorem 4. Let f(x) and g(x) be n-rowed matric polynomials of respective
degrees s and t such that the leading coefficient of g(x) is nonsingular. Then
there exist unique polynomials q(x), Q(x), r(x), R(x) such that r(x) and R(x)
have virtual degree $t - 1$ and

(13)    $f(x) = q(x)g(x) + r(x) = g(x)Q(x) + R(x)$.

Moreover, if $s < t$, then $q(x) = Q(x) = 0$, while if $s \ge t$, then q(x) and
Q(x) have degree $s - t$.

While the proof is very much the same as that in Theorem 1.1, let us
give it in some detail. We assume first that $s \ge t$ and let $A_0 \ne 0$ and $B_0$ be
the respective leading coefficients of f(x) and g(x). Then $B_0^{-1}$ exists, and if
$q_1(x) = A_0 B_0^{-1} x^{s-t}$ the polynomial $f_1(x) = f(x) - q_1(x)g(x)$ has virtual
degree $s - 1$. This implies a finite process by means of which we begin with
a polynomial $f_i(x)$ of virtual degree $s - i \ge t$ and virtual leading coefficient
$A_0^{(i)}$ and then form $f_{i+1}(x) = f_i(x) - q_{i+1}(x)g(x)$ of virtual degree $s - i - 1$
for $q_{i+1}(x) = A_0^{(i)} B_0^{-1} x^{s-i-t}$. The process terminates when we obtain an
$f_j(x)$ of virtual degree $t - 1$. Thus we have the first equation of (13) with
$q(x) = 0$, $r(x) = f(x)$ if $s < t$, and otherwise with q(x) of degree $s - t$ and
leading coefficient $A_0 B_0^{-1}$ and r(x) the $f_j(x)$ above. If also $f(x) = q_0(x)g(x) +
r_0(x)$ for $q_0(x) - q(x) \ne 0$, the degree of this polynomial is $h \ge 0$, and its
leading coefficient is $C_0 \ne 0$. Also $C_0 B_0 \ne 0$ since $B_0$ is nonsingular. But
then by Lemma 4 the degree of $[q_0(x) - q(x)]g(x) = r(x) - r_0(x)$ is $t + h$,
whereas its virtual degree is $t - 1$, a contradiction. This proves the uniqueness
of q(x) and r(x). The existence and uniqueness of Q(x) and R(x) is
proved in exactly the same way except that we begin by forming
$f(x) - g(x) B_0^{-1} A_0 x^{s-t}$.
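The finite process in this proof can be sketched in a few lines of Python. The sketch below is only an illustration under stated assumptions: a matric polynomial is stored as a list of its coefficient matrices with the highest power first, the matrices are two-rowed so that a closed-form inverse suffices, and the names `right_divide`, `poly_mul`, `poly_add`, and `inv2` as well as the sample coefficients are hypothetical choices, not notation from the text.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def mat_sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def inv2(M):
    """Inverse of a nonsingular two-rowed matrix in exact arithmetic."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("leading coefficient of g(x) must be nonsingular")
    return [[d / det, -b / det], [-c / det, a / det]]

def right_divide(f, g):
    """Return (q, r) with f = q*g + r, r of virtual degree t - 1."""
    t = len(g) - 1
    B0_inv = inv2(g[0])
    rem, q = list(f), []
    while len(rem) - 1 >= t:
        lead = mat_mul(rem[0], B0_inv)       # current leading coefficient times B0^{-1}
        q.append(lead)
        for k in range(t + 1):               # subtract lead * x^(d - t) * g(x)
            rem[k] = mat_sub(rem[k], mat_mul(lead, g[k]))
        rem.pop(0)                           # the leading coefficient is now zero
    return q, rem

def poly_mul(p, q):
    """Product of two matric polynomials (order of factors matters)."""
    n = len(p[0])
    zero = [[0] * n for _ in range(n)]
    out = [zero for _ in range(len(p) + len(q) - 1)]
    for i, P in enumerate(p):
        for j, Q in enumerate(q):
            out[i + j] = mat_add(out[i + j], mat_mul(P, Q))
    return out

def poly_add(p, q):
    """Sum of two matric polynomials, aligned at the constant term."""
    n = len((p or q)[0])
    zero = [[0] * n for _ in range(n)]
    size = max(len(p), len(q))
    p = [zero] * (size - len(p)) + p
    q = [zero] * (size - len(q)) + q
    return [mat_add(P, Q) for P, Q in zip(p, q)]

# Sample data (hypothetical): f of degree 2, g of degree 1.
f = [[[1, 2], [0, 1]], [[0, 1], [1, 0]], [[2, 0], [1, 3]]]
g = [[[1, 1], [0, 1]], [[0, 1], [1, 1]]]
q, r = right_divide(f, g)
print(poly_add(poly_mul(q, g), r) == f)      # checks f = q*g + r
```

The left division would be obtained in the same way by multiplying by $B_0^{-1}$ on the left and forming `mat_mul(g[k], lead)` instead.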

Let us regard the first equation of (13) as the right division of f(x) by
g(x), the second as the left division of f(x) by g(x). Then we shall speak
correspondingly of q(x) and r(x) as right quotient and remainder, of Q(x)
and R(x) as left quotient and remainder. If r(x) = 0, we have f(x) =
q(x)g(x), and g(x) is a right divisor of f(x). Similarly, we call g(x) a left
divisor of f(x) if f(x) = g(x)Q(x), so that R(x) = 0 in (13).

It is natural now to try to prove a Remainder Theorem for matric polynomials.
However, the theorem in usual form is ambiguous since, for example,
if C is an n-rowed square matrix and $f(x) = A_0 x^2 = x^2 A_0 = x A_0 x$,
the polynomial f(C) might mean any one of $A_0 C^2$, $C^2 A_0$, $C A_0 C$, and these
matrices might all be different. Thus we must first define what we shall
mean by f(C). We shall do this and obtain a Remainder Theorem which we
state as

Theorem 5. Let f(x) be an n-rowed square matric polynomial (12), C be
an n-rowed square matrix with elements in F. Define $f_R(C)$ (read: f right of C)
and $f_L(C)$ (read: f left of C) by

(14)    $f_R(C) = A_0 C^s + A_1 C^{s-1} + \ldots + A_{s-1} C + A_s$

and

(15)    $f_L(C) = C^s A_0 + C^{s-1} A_1 + \ldots + C A_{s-1} + A_s$.

Then the right and left remainders on division of f(x) by $xI - C$ are $f_R(C)$
and $f_L(C)$, respectively.

To make our proof as in the case of polynomials with coefficients in a
field we use Theorem 4 with $g(x) = xI - C$ and have $f(x) = q(x)g(x) +
r(x)$ where q(x) has degree $s - 1$ and $r(x) = B$ has elements in F. We then
wish to draw our conclusion from $f_R(C) = q_R(C)(C - C) + B$. This statement
is correct, but let us examine it more closely. We write $q(x) =
C_0 x^{s-1} + \ldots + C_{s-1}$ and have

$h(x) = q(x)(xI - C)$
$= (C_0 x^s + C_1 x^{s-1} + \ldots + C_{s-1} x) - (C_0 C x^{s-1} + \ldots + C_{s-1} C)$.

Then, if D is any n-rowed square matrix with elements in F, we have

$h_R(D) = C_0 D^s + (C_1 - C_0 C) D^{s-1} + \ldots + (C_{s-1} - C_{s-2} C) D - C_{s-1} C$,

while

$q_R(D)(D - C) = C_0 D^s + (C_1 D^{s-1} - C_0 D^{s-1} C) + \ldots + (C_{s-1} D - C_{s-2} D C) - C_{s-1} C$,

and these matrices are equal in general* if and only if D and C are commutative.
They are equal if D = C, and thus $f(x) = h(x) + B$ implies that
$f_R(C) = h_R(C) + B = q_R(C)(C - C) + B = B$. The second part of our
theorem is proved similarly.
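As a numerical illustration of Theorem 5, the following Python sketch evaluates $f_R(C)$ and $f_L(C)$ and compares $f_R(C)$ with the remainder obtained by carrying out the right division by $xI - C$ coefficient by coefficient. It is a sketch under assumptions: polynomials are stored as lists of coefficient matrices with the highest power first, f is assumed to have degree at least one, and the function names and the sample matrices are hypothetical.

```python
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def eval_right(coeffs, C):
    """f_R(C) = A_0 C^s + A_1 C^(s-1) + ... + A_s (powers of C on the right)."""
    n = len(C)
    result = [[0] * n for _ in range(n)]
    for A in coeffs:                      # Horner's scheme: result*C + A
        result = mat_add(mat_mul(result, C), A)
    return result

def eval_left(coeffs, C):
    """f_L(C) = C^s A_0 + C^(s-1) A_1 + ... + A_s (powers of C on the left)."""
    n = len(C)
    result = [[0] * n for _ in range(n)]
    for A in coeffs:                      # Horner's scheme: C*result + A
        result = mat_add(mat_mul(C, result), A)
    return result

def right_divide_by_linear(coeffs, C):
    """Return (q, B) with f(x) = q(x)(xI - C) + B.

    Comparing coefficients gives C_0 = A_0, C_k = A_k + C_(k-1) C,
    and B = A_s + C_(s-1) C.
    """
    q = [coeffs[0]]
    for A in coeffs[1:-1]:
        q.append(mat_add(A, mat_mul(q[-1], C)))
    B = mat_add(coeffs[-1], mat_mul(q[-1], C))
    return q, B

# Sample data (hypothetical): f(x) = A0 x^2 + A1 x + A2.
f = [[[1, 0], [0, 2]], [[0, 1], [1, 0]], [[1, 1], [0, 1]]]
C = [[0, 1], [2, 3]]
q, B = right_divide_by_linear(f, C)
print(B == eval_right(f, C))                  # remainder equals f_R(C)
print(eval_right(f, C) == eval_left(f, C))    # the two values differ in general
```

For this choice of f and C the two evaluations do not agree, which is the ambiguity the theorem resolves.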

As a consequence of the result just proved we have the Factor Theorem 
for matric polynomials which we state as 

Theorem 6. The matric polynomial f(x) has $xI - C$ as a right divisor if
and only if $f_R(C) = 0$; it has $xI - C$ as a left divisor if and only if $f_L(C) = 0$.

For Theorem 5 implies that, if $f_R(C) = 0$, then in (13) the polynomial
$r(x) = 0$, $f(x) = q(x)(xI - C)$. Conversely, if $f(x) = q(x)(xI - C)$, we
have seen that $f_R(C) = q_R(C)(C - C) = 0$. The results on the left follow
similarly.

Our principal use of the result above is precisely what is usually called
the trivial part of Theorem 5, that is, if f(x) has $xI - C$ as a factor, then
C is a root of f(x). However, it is nontrivial that $f_R(C) = 0$ and follows
only from the study above where we showed that if D is any square matrix
such that DC = CD, then $h(x) = q(x)(xI - C)$ implies that $h_R(D) =
q_R(D)(D - C)$.

* E.g., if $q(x) = x$ so that $h(x) = x^2 - Cx$, then $h_R(D) = D^2 - CD$, $q_R(D)(D - C) =
D^2 - DC \ne D^2 - CD$ unless DC = CD.

EXERCISES

1. Express the following matrices as polynomials f(x) and g(x) with matric
coefficients and compute q(x), Q(x), r(x), R(x) of (13).

•'/ 



d) Six) = 


2. Use f(x) of Ex. 1(c) and find $f_R(C)$ and $f_L(C)$ by the use of the division process
as well as by substitution if


a) C 


f «■) 

\2 0 0 / 


6 ) C = 


0 2 O' 
1 0 0 
,0 0 1 


/I 2 
c) C= 2 -1 
\0 0 


4. The characteristic matrix and function. If f(x) is a matric polynomial
(12) with n-rowed scalar matrix coefficients $A_k$, then we shall call f(x) a
scalar polynomial. Thus $A_k = a_k I$ for the $a_k$ in F, and

(16)    $f(x) = (a_0 x^s + \ldots + a_s) I$,

where I is the n-rowed identity matrix. We call f(x) monic if $a_0 = 1$. If
now g(x) is also a scalar polynomial, the quotients q(x) and Q(x) in (13)
are the same scalar polynomials and also r(x) = R(x) is scalar. For obviously
this case of Theorem 4 is now the result of multiplying all the polynomials
of Theorem 1.1 by the n-rowed identity matrix.

If A is any n-rowed square matrix, the matrices $f_R(A)$ and $f_L(A)$ are
equal for every scalar polynomial f(x), and we shall designate their common
value by

(17)    $f(A) = a_0 A^s + \ldots + a_{s-1} A + a_s I$.

We now say that either the polynomial f(x) or the equation f(x) = 0 has A
as a root if f(A) = 0. By Theorem 6 the matrix A is a root of f(x) if and
only if f(x) has $xI - A$ as either a right- or a left-hand factor.

We shall call the matrix $xI - A$ the characteristic matrix of A, the determinant
$|xI - A|$ the characteristic determinant of A, the scalar polynomial

(18)    $f(x) = |xI - A| \cdot I$

the characteristic function of A, and the corresponding equation f(x) = 0
the characteristic equation of A. We now apply (3.27) to $xI - A$ to obtain

(19)    $(xI - A)[\mathrm{adj}\,(xI - A)] = [\mathrm{adj}\,(xI - A)](xI - A) = |xI - A| \cdot I$.

Then the elements of $\mathrm{adj}\,(xI - A)$ are the cofactors of the elements of
$xI - A$, and $\mathrm{adj}\,(xI - A)$ is a matric polynomial (in general, nonscalar).
By the argument above we have

Theorem 7. Every square matrix is a root of its characteristic equation.

The g.c.d. of the elements of $\mathrm{adj}\,(xI - A)$ is clearly the polynomial $d_{n-1}(x)$
defined for $xI - A$ and $d_n(x) = |xI - A|$. But then $\mathrm{adj}\,(xI - A) = d_{n-1}(x)B(x)$,
where B(x) is an n-rowed square matrix with elements in F[x]. Hence B(x)
is a matric polynomial. The invariant factors of $xI - A$ were defined in (8),
and by (19) $|xI - A| I = g_1(x) d_{n-1}(x) I = (xI - A)B(x) d_{n-1}(x)$. By the
uniqueness of quotient in Theorem 4 we have

(20)    $g(x) = g_1(x) I = (xI - A)B(x)$.

Hence, clearly, g(A) = 0. Observe also that the g.c.d. of the elements of
B(x) is unity so that if B(x) = Q(x)q(x) for a monic scalar polynomial
q(x), then q(x) = 1.

We now define the minimum function of a square matrix A to be the 
monic scalar polynomial of least degree with A as a root. The remark just 
made above then implies 

Theorem 8. Let $g_1(x)$ be the first invariant factor of the characteristic matrix
of A. Then $g(x) = g_1(x) I$ is the minimum function of A.

For if h(x) is the minimum function of A we may write g(x) = h(x)q(x) +
r(x) for scalar polynomials q(x) and r(x) such that the degree of r(x) is less
than that of h(x). But h(A) = 0, and g(A) = 0 so that r(A) = 0, r(x) = 0.
Hence g(x) = h(x)q(x), and since g(x) and h(x) are monic so is q(x). By
Theorem 6 we have $h(x) = (xI - A)Q(x)$, and by (20) we have $g(x) =
(xI - A)Q(x)q(x) = (xI - A)B(x)$. The uniqueness in Theorem 4 then
states that B(x) = Q(x)q(x), q(x) = 1, from which g(x) = h(x) as desired.

We see now that $|xI - A|$ is a monic polynomial of degree n and is not
zero, $r = m = n$ in (9), and

(21)    $P(x) \cdot (xI - A) \cdot Q(x) = \mathrm{diag}\{g_1, \ldots, g_n\}$

for elementary matrices P(x) and Q(x). But then, as we have already observed
in an earlier discussion,

(22)    $c \cdot |xI - A| \cdot d = g_1 \cdots g_n$

for $c = |P(x)|$ and $d = |Q(x)|$ in F. Hence cd = 1, and we have proved

Theorem 9. The characteristic function of a square matrix A is the product
by I of the product of the invariant factors of the characteristic matrix of A.

This result implies that $|xI - A|$ is the product of $g_1(x)$ by divisors
$g_i(x)$ of $g_1(x)$. It follows that every root of $|xI - A|$ in any field K containing
F is a root of $g_1(x)$. But in fact we have already seen that if F is
the field of all complex numbers, the elementary divisors of $xI - A$ are
polynomials $(x - c_j)^{n_{ij}}$ whose product is $|xI - A|$. Then the $c_j$ are the
distinct roots of $g_1(x)$ as well as of $|xI - A|$. They are called the characteristic
roots of A.

In closing this section we note that if we write $f(x) = |xI - A| \cdot I =
(x^n + a_1 x^{n-1} + \ldots + a_n) I$, then $f(0) = |-A| \cdot I = (-1)^n |A| \cdot I$, so that
$|A| = (-1)^n a_n$. It follows that, if $|A| \ne 0$, then

(23)    $A^{-1} = -a_n^{-1}(A^{n-1} + a_1 A^{n-2} + \ldots + a_{n-1} I)$,

and hence it is obvious that $A A^{-1} = A^{-1} A$. Moreover, if $|A| = 0$, then
$A^{-1}$ does not exist. If, then, $A \ne 0$ and $g_1(x) = x^m + b_1 x^{m-1} + \ldots + b_m$,
the polynomial $g(x) = g_1(x) I$ can be the minimum function of A only if
$b_m = 0$. Since $A \ne 0$, we have $m > 1$, and $G = A^{m-1} + b_1 A^{m-2} + \ldots +
b_{m-1} I$ is a nonzero matrix with the property

(24)    $AG = GA = 0$.
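Theorem 7 and formula (23) lend themselves to a quick numerical check. The Python sketch below is an illustration only. To obtain the coefficients $a_1, \ldots, a_n$ of $|xI - A|$ it uses the Faddeev-LeVerrier recurrence, a standard method that is not part of the text and is brought in here purely for convenience; the function names and the sample matrix are likewise hypothetical.

```python
from fractions import Fraction

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(c, X):
    return [[c * x for x in row] for row in X]

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def char_coeffs(A):
    """Return [a_1, ..., a_n] with |xI - A| = x^n + a_1 x^(n-1) + ... + a_n.

    Faddeev-LeVerrier recurrence (assumed here, not taken from the text):
    M_1 = I, a_k = -trace(A M_k) / k, M_(k+1) = A M_k + a_k I.
    """
    n = len(A)
    M, coeffs = identity(n), []
    for k in range(1, n + 1):
        AM = mat_mul(A, M)
        a_k = -Fraction(sum(AM[i][i] for i in range(n)), k)
        coeffs.append(a_k)
        M = mat_add(AM, scale(a_k, identity(n)))
    return coeffs

def monic_poly_at(coeffs, A):
    """A^m + c_1 A^(m-1) + ... + c_m I for coeffs = [c_1, ..., c_m]."""
    n = len(A)
    result = identity(n)
    for c in coeffs:
        result = mat_add(mat_mul(result, A), scale(c, identity(n)))
    return result

def inverse_by_23(A):
    """A^{-1} = -a_n^{-1} (A^(n-1) + a_1 A^(n-2) + ... + a_(n-1) I), as in (23)."""
    a = char_coeffs(A)
    if a[-1] == 0:
        raise ValueError("|A| = 0, so the inverse does not exist")
    return scale(-1 / a[-1], monic_poly_at(a[:-1], A))

# Sample matrix (hypothetical), with determinant 3.
A = [[2, 1, 0], [0, 1, 1], [1, 0, 1]]
a = char_coeffs(A)
print(monic_poly_at(a, A))                         # f(A), expected to be the zero matrix
print(mat_mul(A, inverse_by_23(A)) == identity(3))
```

Exact rational arithmetic is used so that the comparison with the zero matrix and with I is not affected by rounding.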

EXERCISES

1. When is the minimum function of a matrix linear? 

2. What, then, are the minimum functions of the following matrices? 

«)(?4) «>(n) 

3. Let $A = \mathrm{diag}\{A_1, A_2\}$ where $A_1$ and $A_2$ are square matrices of m and n rows
respectively. Show that if f(x) is any polynomial in F[x] and we define $f_t(x) =
f(x) I_t$ for any t, then $f_{m+n}(A) = \mathrm{diag}\{f_m(A_1), f_n(A_2)\}$. Hint: Prove first by
induction that $A^k = \mathrm{diag}\{A_1^k, A_2^k\}$.

4. Let A have the form of Ex. 3 and let $g(x) I_{m+n}$, $g_1(x) I_m$, and $g_2(x) I_n$ be the
respective minimum functions of A, $A_1$, $A_2$. Prove that g(x) is the least common
multiple of $g_1(x)$ and $g_2(x)$.

5. Apply Ex. 4 in the case where Ai is nonsingular and A 2 = 0. 

6. Compute the characteristic functions of the following matrices.

a) $\begin{pmatrix} 1 & 2 & 3 \\ 2 & -1 & 4 \\ 3 & 1 & 1 \end{pmatrix}$    b) $\begin{pmatrix} 2 & 3 & 1 \\ 1 & 2 & 1 \\ 0 & 0 & -4 \end{pmatrix}$

c) $\begin{pmatrix} 0 & -1 & 2 & -3 \\ 1 & 0 & 3 & -2 \\ -2 & -3 & 0 & 1 \\ 3 & 2 & -1 & 0 \end{pmatrix}$

7. It may be shown that the characteristic function $f(x) = x^n - a_1 x^{n-1} +
\ldots + (-1)^i a_i x^{n-i} + \ldots + (-1)^n a_n$ of an n-rowed square matrix A has the
property that $a_i$ is the sum of all i-rowed principal minors of A. Verify this for the
matrices of Ex. 6.

5. Similarity of square matrices. We have defined two n-rowed square
matrices A and B with elements in a field F to be similar in F if there exists
a nonsingular matrix P with elements in F such that

$PAP^{-1} = B$.

The principal result on the similarity of square matrices is then given by

Theorem 10. Two matrices are similar in F if and only if their characteristic
matrices have the same invariant factors.

For if $PAP^{-1} = B$, then $P(xI - A)P^{-1} = xPIP^{-1} - B = xI - B$.
But P has elements in F, $|P| \ne 0$ is in F, P is elementary. Hence $xI - A$
and $xI - B$ are equivalent in F[x] and have the same invariant factors.
Conversely, let $xI - A$ and $xI - B$ have the same invariant factors, so that

(25)    $P(x)(xI - A)Q(x) = xI - B$

for elementary matrices P(x) and Q(x). We define

(26)    $P = P_L(B)$,  $Q = Q_R(B)$

as in Theorem 5 and have

(27)    $P(x) = (xI - B)P_0(x) + P$,  $Q(x) = Q_0(x)(xI - B) + Q$.

Then

$P(x)(xI - A)Q(x) = (xI - B)P_0(x)(xI - A)Q(x) + P \cdot (xI - A) \cdot Q_0(x)(xI - B) + P \cdot (xI - A) \cdot Q = xI - B$.

We now use (25) and the fact that P(x) and Q(x) are elementary to write
$[P(x)]^{-1} = C(x)$, $[Q(x)]^{-1} = D(x)$ for matric polynomials C(x) and D(x)
such that

(28)    $(xI - A)Q(x) = C(x)(xI - B)$,  $P(x)(xI - A) = (xI - B)D(x)$.

But then from (27) and (28)

$P \cdot (xI - A) = (xI - B)[D(x) - P_0(x)(xI - A)]$

and thus

(29)    $(xI - B) - P \cdot (xI - A) \cdot Q = (xI - B)R(xI - B)$,

where $R = R(x) = P_0(x)C(x) + D(x)Q_0(x) - P_0(x)(xI - A)Q_0(x)$. By Lemma 4
the degree in x of the right member of (29) is at least two unless
R(x) = 0. But the degree in x of the left member of (29) is at most one,
R(x) = 0,

(30)    $P \cdot (xI - A) \cdot Q = xI - B$.

It follows that $PAQ = B$, $PIQ = I$, $Q = P^{-1}$, $PAP^{-1} = B$ as desired.

Observe that the degree of $|xI - A|$ is n and hence that if $n_i$ is the degree
of the ith nontrivial invariant factor $g_i(x)$ the property $|xI - A| =
g_1(x) \cdots g_t(x)$ implies that

(31)    $n_1 + n_2 + \ldots + n_t = n$,  $n_1 \ge n_2 \ge \ldots \ge n_t > 0$.

Obviously, this is an important restriction on the possible degrees of the invariant
factors of the characteristic matrix of an n-rowed square matrix.

EXERCISES

1. What are all possible types of invariant factors of the characteristic matrices 
of square matrices having 1, 2, 3, or 4 rows? 

2. Give the possible elementary divisors of such matrices. 

3. Use the proof of Theorem 10 to show that if $A_1$, $A_2$, $B_1$, and $B_2$ are n-rowed
square matrices such that $A_1$ and $B_1$ are nonsingular, then $A_1 x + A_2$ and $B_1 x + B_2$
are equivalent in F[x] if and only if there exist nonsingular matrices P and Q with
elements in F such that $PA_1Q = B_1$ and $PA_2Q = B_2$. Hint: Take $A = -A_1^{-1}A_2$,
$B = -B_1^{-1}B_2$ in (25).

4. Show that the hypothesis that $A_1$ is nonsingular in Ex. 3 is essential by proving
that $A_1 x - I$ and $B_1 x - I$ are equivalent in F[x] yet P and Q do not exist, if

$A_1 = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}$,  $B_1 = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix}$.


6. Characteristic matrices with prescribed invariant factors. If $g_1(x)$,
$\ldots$, $g_t(x)$ are the nontrivial invariant factors of a matrix $xI - A$ and $n_i$ is
the degree of $g_i(x)$, then by Theorem 10 the n-rowed square matrix

$B = \mathrm{diag}\{B_1, \ldots, B_t\}$

will be similar in F to A if $B_i$ is an $n_i$-rowed square matrix such that the
only nontrivial invariant factor of the characteristic matrix of $B_i$ is $g_i(x)$.
For then $xI - B = \mathrm{diag}\{xI_{n_1} - B_1, \ldots, xI_{n_t} - B_t\}$ is equivalent in
F[x] to $\mathrm{diag}\{G_1, \ldots, G_t\}$, where $G_i = \mathrm{diag}\{g_i, 1, \ldots, 1\}$, and we conclude
that $xI - B$ has the same invariant factors as $xI - A$. Thus the problem
of constructing an n-rowed square matrix A whose characteristic matrix has
prescribed invariant factors is completely solved by the result we state as
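
Theorem 9. Let $g(x) = x^n - (b_1 x^{n-1} + \ldots + b_n)$ and

(32)    $A = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ b_n & b_{n-1} & b_{n-2} & \cdots & b_1 \end{pmatrix}$.

Then g(x)I is both the characteristic and the minimum function of A, g(x) is
the only nontrivial invariant factor of $xI - A$.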

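For

(33)    $xI - A = \begin{pmatrix} x & -1 & 0 & \cdots & 0 & 0 \\ 0 & x & -1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & x & -1 \\ -b_n & -b_{n-1} & -b_{n-2} & \cdots & -b_2 & x - b_1 \end{pmatrix}$.
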
The complementary submatrix of the element $-b_n$ of (33) is an $(n - 1)$-rowed
square triangular matrix with diagonal elements all $-1$ and its determinant
is $(-1)^{n-1}$. Thus the cofactor of $-b_n$ is $(-1)^{n+1}(-1)^{n-1} = 1$
and hence $d_{n-1}(x) = 1$. It remains to prove that $d_n(x) = |xI - A| = g(x)$.
This is true if n = 1, since then $A = A_1 = (b_1)$, and $|xI - A| = x - b_1$.
Let it be true for matrices $A_{n-1}$ of the form (33) and of $n - 1$ rows, so that
$|xI - A_{n-1}| = x^{n-1} - (b_1 x^{n-2} + \ldots + b_{n-1})$ is the cofactor of the element
x in the first row and column of (33). We now expand (33) according to its
first column and obtain $|xI - A| = x[x^{n-1} - (b_1 x^{n-2} + \ldots + b_{n-1})] -
b_n = g(x)$ as desired. This proves our theorem.
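The statement just proved can be checked numerically for a small case. The following Python sketch builds the matrix (32) from a list of the $b_i$, compares $|xI - A|$ with g(x) at n + 1 integer values of x (two monic polynomials of degree n that agree at that many points are identical), and verifies that g(A) = 0. The function names and the sample coefficients are hypothetical, and the determinant is computed by a plain cofactor expansion that is adequate only for small n.

```python
def companion(b):
    """Matrix (32) for g(x) = x^n - (b_1 x^(n-1) + ... + b_n), with b = [b_1, ..., b_n]."""
    n = len(b)
    rows = [[int(j == i + 1) for j in range(n)] for i in range(n - 1)]
    rows.append(list(reversed(b)))            # last row: b_n, b_(n-1), ..., b_1
    return rows

def det(M):
    """Determinant by expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def g_at(b, x):
    """Value of g(x) = x^n - (b_1 x^(n-1) + ... + b_n)."""
    n = len(b)
    return x ** n - sum(bk * x ** (n - 1 - k) for k, bk in enumerate(b))

def char_det_at(A, x):
    """Value of |xI - A| at a number x."""
    n = len(A)
    return det([[x * int(i == j) - A[i][j] for j in range(n)] for i in range(n)])

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def g_of_matrix(b, A):
    """g(A) = A^n - (b_1 A^(n-1) + ... + b_n I), by Horner's scheme."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for bk in b:
        prod = mat_mul(result, A)
        result = [[prod[i][j] - bk * int(i == j) for j in range(n)] for i in range(n)]
    return result

b = [2, -1, 3]                                # g(x) = x^3 - 2x^2 + x - 3 (sample)
A = companion(b)
print(all(char_det_at(A, x) == g_at(b, x) for x in range(len(b) + 1)))
print(g_of_matrix(b, A))                      # expected to be the zero matrix
```

The check does not by itself show that g(x)I is the minimum function; that part rests on $d_{n-1}(x) = 1$ as in the proof.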

The construction of square matrices A with complex number elements
whose characteristic matrices have prescribed elementary divisors has a
simple solution, and we shall see that the argument preceding Theorem 9
reduces the solution to the proof of

Theorem 10. Let c be a complex number, A be the n-rowed square matrix

(34)    $A = \begin{pmatrix} c & 1 & 0 & \cdots & 0 & 0 \\ 0 & c & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & c & 1 \\ 0 & 0 & 0 & \cdots & 0 & c \end{pmatrix}$.

Then the only nontrivial invariant factor of $xI - A$ is $(x - c)^n$.

For $xI - A$ is a triangular matrix with diagonal elements all $x - c$,
$|xI - A| = (x - c)^n$. The complementary submatrix of the element in the
nth row and first column of $xI - A$ is a triangular matrix with diagonal
elements all $-1$, $d_{n-1}(x) = 1$, and $d_n(x) = (x - c)^n$ is the only nontrivial
invariant factor of $xI - A$.

Thus if $c_1, \ldots, c_t$ are complex numbers and $n_1, \ldots, n_t$ are positive integers,
we construct matrices $A_i$ of the form (34) for $c = c_i$ and with $n_i$
rows. The matrix $A = \mathrm{diag}\{A_1, \ldots, A_t\}$ then has n rows, and its characteristic
matrix $xI - A = \mathrm{diag}\{B_1, \ldots, B_t\}$, where $B_i = xI_{n_i} - A_i$ is
equivalent in F[x] to $\mathrm{diag}\{f_i, 1, \ldots, 1\}$ such that $f_i = (x - c_i)^{n_i}$. But
then $xI - A$ is equivalent in F[x] to $\mathrm{diag}\{f_1, \ldots, f_t, 1, \ldots, 1\}$, and by
Theorem 3 the nontrivial elementary divisors of $xI - A$ are $f_1, \ldots, f_t$ as
desired.
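By Theorem 8 the only nontrivial invariant factor $(x - c)^n$ is also the minimum function of a matrix of the form (34), which means that $(A - cI)^n = 0$ while $(A - cI)^{n-1} \ne 0$. The short Python sketch below checks this for one sample case; the function names and the values n = 4, c = 3 are hypothetical choices made for illustration.

```python
def jordan_type_block(c, n):
    """n-rowed matrix with c on the diagonal and 1 just above it."""
    return [[c if i == j else int(j == i + 1) for j in range(n)] for i in range(n)]

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def power(X, m):
    """X^m for an integer m >= 0 by repeated multiplication."""
    n = len(X)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(m):
        result = mat_mul(result, X)
    return result

def is_zero(X):
    return all(x == 0 for row in X for x in row)

n, c = 4, 3
A = jordan_type_block(c, n)
N = [[A[i][j] - c * int(i == j) for j in range(n)] for i in range(n)]   # N = A - cI

print(is_zero(power(N, n)))        # (A - cI)^n vanishes
print(is_zero(power(N, n - 1)))    # (A - cI)^(n-1) does not
```

Since the minimum function must divide $(x - c)^n$, the least m with $(A - cI)^m = 0$ gives its degree.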

EXERCISES

1. Compute the invariant factors and elementary divisors of the characteristic
matrices of the following matrices.



2. Find a matrix $B = \mathrm{diag}\{B_1, \ldots, B_t\}$ similar in F to each of the matrices of
Ex. 1, respectively, where $B_i$ has the form (32), and the characteristic function of $B_i$
is the ith nontrivial invariant factor of A.

3. Solve Ex. 2 with the characteristic function of $B_i$ now the ith nontrivial
elementary divisor, and $B_i$ of the form (34).


7. Additional topics. There are many important topics of the theory of
matrices other than those we have discussed, and we leave their exposition
to more advanced texts. Let us mention some of these topics here, however.
The quantities of the field K of all complex numbers have the form

$c = a + bi$    (a, b in R; $i^2 = -1$),

where R is the field of all real numbers, a subfield of K. The complex conjugate
of c is

$\bar{c} = a - bi$,

and the correspondence $c \leftrightarrow \bar{c}$ defines a self-equivalence or automorphism
of K, a fact verified in Section 6.9. This automorphism of K leaves the
elements of its subfield R unaltered, that is, $\bar{a} = a$ for every real a. We
then may call it an automorphism over R.

If A is any m by n matrix with elements $a_{ij}$ in K, we define $\bar{A}$ to be the
m by n matrix whose element in its ith row and jth column is $\bar{a}_{ij}$. It is then
a simple matter to verify that

$\overline{AB} = \bar{A}\bar{B}$,  $(\bar{A})' = \overline{A'}$,  $(\bar{C})^{-1} = \overline{C^{-1}}$,

for every A, B, and nonsingular C. Moreover, then $(\overline{AB})' = \bar{B}'\bar{A}'$. We
now define a matrix A to be Hermitian if $\bar{A}' = A$, skew-Hermitian if $\bar{A}' =
-A$. Two matrices A and B are said to be conjunctive in K if there exists a
nonsingular matrix P with elements in K such that

$PA\bar{P}' = B$.

The results of this theory are almost exactly the same as those of the theory
of congruent matrices, and it is, in fact, possible to obtain a general theory
including both of the theories above as special cases.

Two symmetric matrices A and B with elements in a field F are said to
be orthogonally equivalent in F if there exists an orthogonal matrix P with
elements in F such that $PAP' = B$. But $PP' = P'P = I$ so that $B =
PAP^{-1}$ is both similar and congruent to A. Analogously, we call a matrix P
with complex elements such that $P\bar{P}' = \bar{P}'P = I$, a unitary matrix. Then
we say that two Hermitian matrices A and B are unitary equivalent if
$PA\bar{P}' = B$, where P is a unitary matrix. Both of these concepts may also
be shown to be special cases of a more general concept.

Finally, let us mention the topic of the equivalence and congruence of
pairs of matrices. Let A, B, C, D be matrices of the same numbers of rows
and columns. Then we call the pairs A, B and C, D equivalent* pairs if
there exist nonsingular square matrices P and Q such that simultaneously
PAQ = C and PBQ = D. Similarly, if A, B, C, D are n-rowed square
matrices, we call A, B and C, D congruent pairs if simultaneously PAP' =
C and PBP' = D for a nonsingular matrix P.

References to treatments of the topics mentioned above as well as others 
will be found in the final bibliographical section of Chapter VI. We shall 
not state any of the results here. 

* In this connection see Exs. 3 and 4 of Section 5. 

FUNDAMENTAL CONCEPTS

1. Groups. These pages were written in order to bridge the gap between
the intuitive function-theoretic study of algebra, as presented in the usual
course on the theory of equations, and the abstract approach of the author’s
Modern Higher Algebra. The objective of our exposition has now been attained.
For our study of matrices with constant elements led us naturally
to introduce the concepts of field, linear space, correspondence, and equivalence,
and we are ready now to begin the study of abstract algebra. However,
we believe it desirable to precede the serious study of material such
as that of the first two chapters of the Modern Higher Algebra by a brief
discussion of this subject matter, without proofs (or exercises). We shall
give this discussion here and shall therewith not only leave our readers with
an acquaintance with the basic concepts of algebraic theory but with a
knowledge of how these concepts may lead into those branches of mathematics
called the Theory of Numbers and the Theory of Algebraic Numbers.

Our first new concept is that of a set G of elements closed with respect
to a single operation, and we wish to define the concept that G forms a
group with respect to this operation. It should be clear that if we do not
state the nature either of the elements of G or of the operation, it will not
matter if we indicate the operation as multiplication. If we wish later to
consider special sets of elements with specified operations we shall then replace
“product” in our definition by the operation desired. Thus we shall
make the

Definition. A set G of elements a, b, c, . . . is said to form a group with
respect to multiplication if for every a, b, c of G

I. The product ab is in G;

II. The associative law a(bc) = (ab)c holds;

III. There exist solutions x and y in G of the equations

ax = b ,   ya = b .

The reader is already familiar with the groups (with respect to ordinary
multiplication) of all nonzero rational numbers, all nonzero real numbers,
and, indeed, all nonzero elements of any field. These are all examples of
groups G such that for every a and b of G we have

IV. The products ab = ba .

Such groups are called commutative or abelian groups. An example of a
nonabelian (noncommutative) group is the group, with respect to matrix
multiplication, of all nonsingular n-rowed square matrices with elements in a
field.

Every group G contains a unique element e called its identity element,
such that for every a of G

(1)    $ae = ea = a$.

Moreover, every element a of G has a unique inverse $a^{-1}$ in G such that

(2)    $aa^{-1} = a^{-1}a = e$.

Then the solutions of the equations of Axiom III are the unique elements

(3)    $x = a^{-1}b$,  $y = ba^{-1}$.

A set H of elements of a group G is called a subgroup of G if the product
of any two elements of H is in H, H contains the identity element of G
and the inverse element $h^{-1}$ of every h of H. Then H forms a group with
respect to the same operation as does G.

The equivalence of two groups is defined as an instance of the general 
definition of equivalence which we gave in Section 4.5. The concept of 
equivalence of two mathematical systems of the same kind as well as the 
concept of subsystem (e.g., subgroup of a group, subfield of a field, linear 
subspace over F of a linear space over F) are two concepts of evident funda¬ 
mental importance which are given in algebraic theory whenever any new 
mathematical system is defined. 

The number of elements in a group G is called its order. This number is 
either infinity, and we call G an infinite group; or it is a finite number n, and 
we call G a finite group of order n. For finite groups we have the important 
result which we shall state without proof.* 

Lemma 1. Let H be a subgroup of a finite group G. Then the order of H
divides the order of G.

* The proof is given in Chapter VI of my Modern Higher Algebra.

A simple example of a finite abelian group is the set of all nth roots of
unity. The reader may verify that an example of a finite nonabelian group
is given by the quaternion group of order 8 whose elements are the two-rowed
matrices with complex elements,

(4)    $I$,  $A = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$,  $B = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$,  $AB$,  $-I$,  $-A$,  $-B$,  $-AB$.

The set of all powers

(5)    $a^0 = e$, $a$, $a^{-1}$, $a^2$, $a^{-2}$, . . .

of an element a of a group G forms a subgroup of G which we shall designate
by

(6)    [a]

and shall call the cyclic group generated by a. Its order is called the order of
the element a and it can be shown that either all the powers (5) are distinct
and a has infinite order, or a has finite order m, and [a] consists of the m
distinct powers

(7)    $e$, $a$, $a^2$, . . . , $a^{m-1}$,

where e is the identity element of [a], $a^m = e$. Then the order m of a is the
least positive integer t such that $a^t = e$. Moreover, it can be shown that $a^t = e$ if
and only if m divides t.

The order m of an element a of a finite group G divides the order n of G,
since m is the order of the subgroup [a], and we may apply Lemma 1. Thus
$n = mq$, $a^n = (a^m)^q = e^q = e$. We therefore have

Lemma 2. Let e be the identity element of a group G of order n. Then

(8)    $a^n = e$

for every a of G.
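Lemma 2 and the remarks on orders can be tried out on the quaternion group of order 8. The Python sketch below generates the group from two matrices A and B, here assumed to be $A = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$ and $B = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, and checks that it has 8 elements, that $a^8 = e$ for every element, and that the order of each element divides 8. The helper names are hypothetical.

```python
def clean(z):
    """Normalize an entry to a complex number with integer parts."""
    z = complex(z)
    return complex(round(z.real), round(z.imag))

def mul(X, Y):
    """Product of two-rowed matrices stored as tuples of tuples."""
    return tuple(tuple(clean(sum(X[i][k] * Y[k][j] for k in range(2)))
                       for j in range(2)) for i in range(2))

def as_matrix(rows):
    return tuple(tuple(clean(x) for x in row) for row in rows)

I = as_matrix([[1, 0], [0, 1]])
A = as_matrix([[1j, 0], [0, -1j]])     # assumed form of A
B = as_matrix([[0, -1], [1, 0]])       # assumed form of B

def generate(gens):
    """All products of the generators; in a finite group this is the subgroup they generate."""
    elems, frontier = {I}, [I]
    while frontier:
        X = frontier.pop()
        for g in gens:
            Y = mul(X, g)
            if Y not in elems:
                elems.add(Y)
                frontier.append(Y)
    return elems

def power(X, m):
    result = I
    for _ in range(m):
        result = mul(result, X)
    return result

def order(X):
    """Least positive integer t with X^t = I."""
    t, Y = 1, X
    while Y != I:
        Y, t = mul(Y, X), t + 1
    return t

G = generate([A, B])
n = len(G)
print(n)                                       # order of the group
print(all(power(a, n) == I for a in G))        # Lemma 2
print(sorted(order(a) for a in G))             # each order divides n
print(mul(A, B) == mul(B, A))                  # the group is not abelian
```

The same functions could be applied to any finite set of invertible two-rowed matrices with Gaussian-integer entries that generates a finite group.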

2. Additive groups. In any field and in the set of all n-rowed square matrices
there are two operations. Thus we have said that the set of all nonzero
elements of a field and the set of all nonsingular n-rowed square matrices
form multiplicative groups. But the elements of any field, the set of all
m by n matrices with elements in a field, the elements of any linear space,
all form additive groups, that is, groups with respect to the operation of
addition as defined for each of these mathematical systems. The reader will
observe that the axioms for an additive abelian group G are those axioms
for addition which we gave in Section 3.12 for a field. Additive groups are
normally assumed to be abelian, that is, the use of addition to designate the
operation with respect to which a group G is defined is usually taken to
connote the fact that G is abelian.

The identity element of an additive group is usually called its zero
element, that is, the element 0 such that a + 0 = a = 0 + a. The inverse
with respect to addition of a is designated as $-a$ and is such that
$a + (-a) = (-a) + a = 0$. Thus the solutions of the additive formulation

(9)    $a + x = b$,  $y + a = b$,

of the equations of our group Axiom III are

(10)    $x = (-a) + b$,  $y = b + (-a)$.

When G is abelian, we have x = y and designate their common value by
$b - a$. Thus we define the operation of subtraction in terms of that of addition.

In a cyclic additive group [a] the elements are always designated by

(11)    $0$, $a$, $-a$, $2 \cdot a$, $-(2 \cdot a)$, . . . ,

where, clearly, if m is any positive integer $-(m \cdot a) = m \cdot (-a)$, and we
define $(-m) \cdot a = -(m \cdot a)$. Here $m \cdot a$ does not mean the product of a
by the positive integer m but means the sum a + . . . + a with m summands.
If [a] is a finite group of order m, the elements of [a] are 0, a, $2 \cdot a$,
. . . , $(m - 1) \cdot a$, and m is the least positive integer such that the sum of m
summands all equal to a is zero. However, if [a] is infinite, then it may be
seen that $n \cdot a = q \cdot a$ for any integers n and q if and only if n and q are
equal.

3. Rings. The set consisting of all n-rowed square matrices with elements
in a field F is an instance of a certain type of mathematical system called a
ring. Many other systems which are known to the reader are rings and we
shall make the

Definition. A ring is an additive abelian group R of at least two distinct
elements such that, for every a, b, c of R,

I. The product ab is in R;

II. The associative law a(bc) = (ab)c holds;

III. The distributive laws a(b + c) = ab + ac, (b + c)a = ba + ca hold.

We leave to the reader the explicit formulation of the definitions of subring
and equivalence of rings. They may be found in the first chapter of the
Modern Higher Algebra. The reader should also verify that all nonmodular
fields are rings, and that the set of all ordinary integers is a ring.

The zero element of a ring R is its identity element with respect to addi¬ 
tion. Observe that by making the hypothesis that R contains at least two 
elements we exclude the mathematical system consisting of zero alone from 
the systems we have called rings. 

Rings may now be seen to be mathematical systems R whose elements
have all the ordinary properties of numbers except that possibly the products
ab and ba might be different elements of R, the equations ax = b or
ya = b might not have solutions in R if $a \ne 0$, b are in R. A ring may also
contain divisors of zero, that is, elements $a \ne 0$, $c \ne 0$ such that ac = 0.
In particular, the ring of all n-rowed square matrices has already been seen
to have such elements as well as the other properties just mentioned.

A ring is said to possess a unity element e if e in R has the property
ea = ae = a for every a of R. The element e then has the properties of the
ordinary number 1 and is usually designated by that symbol. The unity
element of the set of all n-rowed square matrices is the n-rowed identity
matrix, and the unity element was always the number 1 in the other rings
we have studied. However, the set of all two-rowed square matrices of the
form

$\begin{pmatrix} 0 & r \\ 0 & 0 \end{pmatrix}$

with r rational may easily be seen to be a ring without a unity element. In
fact, all nonzero elements of this ring are divisors of zero.

A ring R is said to be commutative if ab = ba for every a and b of R. The
ring of all integers is a commutative ring, the ring of all n-rowed square matrices
with elements in a field F is a noncommutative ring.

4. Abstract fields. If R is any ring, we shall designate by

        R*

the set of all nonzero elements of R. Then we shall call R a division ring if
R* is a multiplicative group. This occurs clearly if and only if the equations
ax = b, ya = b have solutions in R* for every a and b of R*. The set of all
n-rowed square matrices is not a division ring. However, let c and d range
over all complex numbers, and let c̄ and d̄ be the complex conjugates of c
and d. Then the set Q of all two-rowed matrices

            (  c   d )
        A = ( −d̄   c̄ )

is a noncommutative division ring. The reader should verify this, noting
in particular that every A ≠ 0 is nonsingular since A ≠ 0 implies that
c ≠ 0 or d ≠ 0 and |A| = cc̄ + dd̄ > 0. The ring Q is a linear space of
order 4 over the field of all real numbers and it has the matrices I, A, B, AB
of (4) as a basis over that field. It is usually called the ring of real
quaternions.
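
The verification left to the reader may be tried numerically. The following Python sketch is only an illustration: the helper names quat, mul, and det are our own, and the particular matrices A and B chosen here are an assumption that need not coincide with the matrices of (4).

```python
def quat(c, d):
    """The two-rowed matrix with rows (c, d) and (-conj(d), conj(c))."""
    c, d = complex(c), complex(d)
    return ((c, d), (-d.conjugate(), c.conjugate()))

def mul(x, y):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def det(x):
    """Determinant of a 2x2 matrix; for quat(c, d) it equals |c|^2 + |d|^2."""
    return x[0][0] * x[1][1] - x[0][1] * x[1][0]

# Two illustrative elements of Q (our own choice).
A = quat(1j, 0)
B = quat(0, 1)
AB = mul(A, B)
BA = mul(B, A)
```

Here AB and BA both lie in Q but differ, which exhibits the noncommutativity, and det(quat(c, d)) is positive whenever c and d are not both zero.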

Until the present we have restricted the term "field" to mean a field
containing the field of all rational numbers. We now define fields in general.

Definition. A field is a ring F such that F* is a multiplicative abelian
group.

The identity element of the multiplicative group F* is then the unity
element 1 of F. The whole set F is an additive group with identity element
0, and 1 generates an additive cyclic subgroup [1]. If this cyclic group has
infinite order, it may be shown to be equivalent to the set of all ordinary
integers. But F is closed with respect to rational operations, and [1] then
generates a subfield of F equivalent to the field of all rational numbers. We
call all such fields nonmodular fields.

The group [1] might, however, be a finite group. Its elements are then
the sums

(14)    0 · 1 = 0, 1, 2, ..., p − 1,

where p is the order of this group, and we have the property that the sum
1 + 1 + ... + 1 with p summands is zero. It is easy to show that p is a
prime, and it follows that if a is in F, then the sum a + a + ... + a with
p summands is equal to the product (1 + 1 + ... + 1)a = 0. We call
such fields F modular fields of characteristic p. It may easily be shown that
the characteristic of all subfields of a field F is the same as that of F, and,
in fact, every subfield of F contains the subfield generated by the unity
element of F under rational operations.

5. Integral domains. A commutative ring with a unity element and without
divisors of zero is called an integral domain. Any field forms a somewhat
trivial example of an integral domain. Less trivial examples are the set
F[x] of all polynomials in x, the set F[x1, ..., xq] of all polynomials in
x1, ..., xq and coefficients (in both cases) in a field F, and the set of all
ordinary integers. The set of all integers may be extended to the field of all
rational numbers by adjoining quotients a/b, b ≠ 0. In a similar fashion we
adjoin quotients a/b for a and b ≠ 0 in J to imbed any integral domain J in
a field.

When we studied polynomials in Chapter I we studied questions about
divisibility and greatest common divisor. We may also study such questions
about arbitrary integral domains. Let us then formulate such a study.

Let J be an integral domain and a and b be in J. Then we say that a is
divisible by b (or that b divides a) if there exists an element c in J such that
a = bc. If u in J divides its unity element 1, then u is called a unit of J.
Thus u is a unit of J if it has an inverse in J. The inverse of a unit is clearly
also a unit.

Two quantities b and b0 are said to be associated if each divides the other.
Then b and b0 are associated if and only if b0 = bu for a unit u. Moreover,
if b divides a so does every associate of b. Every unit of J divides every a
of J. Thus we are led to one of the most important problems about an
integral domain, that of determining its units.

A quantity p of an integral domain J is called a prime or irreducible
quantity of J if p ≠ 0 is not a unit of J and the only divisors of p in J are
units and associates of p. Every associate of a prime is a prime. A composite
quantity of J is an a ≠ 0 which is neither a prime nor a unit of J. It is
natural then to ask whether or not every composite a of J may be written in
the form

(15)    a = p1 ... pr

for a finite number of primes pi of J. We may also ask if it is true that
whenever also a = q1 ... qs for primes qj then necessarily s = r and the qj
are associates of the pi in some order. When these properties hold we may
call J a unique factorization integral domain. The reader is familiar with the
fact that the set of all ordinary integers is such an integral domain. This
fact, as well as the corresponding property for the set of all polynomials in
x1, ..., xn with coefficients in a field F, is derived in Chapter II of the
author's Modern Higher Algebra.

The problem of determining a g.c.d. (greatest common divisor) of two
elements a and b of a unique factorization domain is solvable in terms of the
factorization of a and b. However, we saw that in the case of the set F[x]
the g.c.d. may be found by a Euclidean process. Let us then formulate the
problem regarding g.c.d.'s. We define a g.c.d. of two elements a and b not
both zero of J to be a common divisor d of a and b such that every common
divisor of a and b divides d. Then all g.c.d.'s of a and b are associates. We
call a and b relatively prime if their only common divisors are units of J,
that is, they have the unity element of J as a g.c.d. We now see that one of
the questions regarding an integral domain J is the question as to whether
every a and b of J have a g.c.d. Moreover, we must ask whether there exist
x and y in J such that

        d = ax + by

is a common divisor and hence a g.c.d. of a and b; and finally whether or not
d may be found by the use of a Euclidean Division Algorithm.

6. Ideals and residue class rings. A subset M of a ring R is called an
ideal of R if M contains g − h, ag, ga for every g and h of M and a of R.
Then M either consists of zero alone and will be called the zero ideal of R,
or M may be seen to be a subring of R with the property that the products
am, ma of any element m of M and any element a of R are in M.

If H is any set of elements of a ring R, we may designate by {H} the set
consisting of all finite sums of elements of the form xmy for x and y in R,
m in H. It is easy to show that {H} is an ideal. If H consists of finitely
many elements m1, ..., mt of R, we write {H} = {m1, ..., mt}, and if H
consists of only one element m of R, we write

(16)    M = {m}

for the corresponding ideal. This most important type of an ideal is called
a principal ideal. It consists of all finite sums of elements of the form amb
for a and b in R. When R is a commutative ring, M = {m} consists of all
products am for a in R.

The ring R itself is an ideal of R called the unit ideal. This term is
derived from the fact that in the case where R has a unity quantity, R = {1}.
Evidently {0} is the zero ideal.

Let M be an ideal of R and define

(17)    a ≡ b (M)

(read: a congruent b modulo M) if a − b is in M. We may then define
what we shall call a residue class ā of M for every a of R. We put into the
class ā every b in R such that a ≡ b (M). Clearly a ≡ b (M) if and only if
b ≡ a (M). Moreover, if a − b is in M and b − c in M, then (a − b) +
(b − c) = a − c is in M. It follows that ā = b̄ (ā and b̄ are the same
residue class) if and only if b is in ā.

Let us now define the sum and product of residue classes by

(18)    ā + b̄ = \overline{a + b} ,    ā · b̄ = \overline{a · b} .

It may be verified readily that if a1 is in ā and b1 is in b̄, then a1 + b1
and a + b lie in the same residue class, and so do a1b1 and ab. It follows
that our definitions of sum and product of residue classes are unique. Then
it is easy to show that if M is not R the set of all the residue classes forms
a ring with respect to the operations just defined. We call this ring the
residue class or difference ring R − M (read: R minus M). When M = R the
residue classes are all the zero class, and we have not called this set a ring.

When the residue class ring R − M is an integral domain, we call the
ideal M a prime ideal of R. We call M a divisorless ideal of R if R − M is
a field. These concepts coincide in the case where R − M has only a finite
number of elements since it may be shown that any integral domain with a
finite number of elements is a field. This coincidence occurs in most of the
topics of mathematics (in particular the Theory of Algebraic Numbers) where
ideals are studied.

7. The ring of ordinary integers. The set of all ordinary integers is a ring
which we shall designate henceforth by E. It is easily seen to be an integral
domain, and we shall prove that it has the property of unique factorization.

We observe first that the units of E are those ordinary integers u such
that uv = 1 for an integer v. But then 1 and −1 are the only units of E.
Thus the primes of E are the ordinary positive prime integers 2, 3, 5, etc.,
and their associates −2, −3, −5, etc. Every integer a is associated with
its absolute value |a| ≥ 0. We note now that if b is any integer not zero,
the multiples

(19)    0, |b|, −|b|, 2|b|, −2|b|, ...

are clearly a set of integers one of which exceeds* any given integer a.
Then let (g + 1)|b| be the least multiple of |b| which exceeds a so that
g|b| ≤ a, (g + 1)|b| > a, a − g|b| = r such that 0 ≤ r < |b|. We put
q = g if b > 0, q = −g otherwise, and have g|b| = qb, a = bq + r. If also
a = bq1 + r1 with 0 ≤ r1 < |b|, then b(q − q1) = r1 − r is divisible by b,
whereas |r − r1| < |b|. This is possible only if r1 = r and q1 = q. We
have thus proved the Division Algorithm for E, a result we state as

Theorem 1. Let a and b ≠ 0 be integers. Then there exist unique integers
q and r such that 0 ≤ r < |b|, a = bq + r.
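
The construction of q and r in the proof may be followed step by step in a short Python sketch. The function name divide is our own; the sketch assumes Python's built-in divmod, which floors the quotient and therefore yields the multiple g|b| of the proof directly.

```python
def divide(a, b):
    """Return (q, r) with a == b*q + r and 0 <= r < abs(b), for b != 0."""
    if b == 0:
        raise ValueError("b must be nonzero")
    # g*|b| is the greatest multiple of |b| not exceeding a.
    g, r = divmod(a, abs(b))
    # As in the proof: q = g if b > 0, q = -g otherwise.
    q = g if b > 0 else -g
    return q, r
```

For instance, divide(-7, 3) gives q = −3 and r = 2, since −7 = 3(−3) + 2.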

We now leave for the reader the application of the Euclidean process, 
which we used to prove Theorem 1.5, to our present case. The process yields 


* We use the concept of magnitude of integers throughout our study of the ring E.

Theorem 2. Let f and g be nonzero integers. Then there exist integers a
and b such that

(20)    d = af + bg

is a positive divisor of both f and g. Then d is the unique positive g.c.d. of f
and g.
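
The Euclidean process referred to above may be sketched as follows. This is an illustration in Python under our own naming (extended_gcd is not a name from the text); it carries the coefficients a and b of (20) along with the successive remainders.

```python
def extended_gcd(f, g):
    """Return (d, a, b) with d = a*f + b*g the positive g.c.d. of nonzero f and g."""
    if f == 0 or g == 0:
        raise ValueError("f and g must be nonzero")
    # Invariants: old_r == old_a*f + old_b*g and r == a*f + b*g.
    old_r, r = f, g
    old_a, a = 1, 0
    old_b, b = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_a, a = a, old_a - q * a
        old_b, b = b, old_b - q * b
    if old_r < 0:  # replace d by its positive associate
        old_r, old_a, old_b = -old_r, -old_a, -old_b
    return old_r, old_a, old_b
```

The coefficients a and b returned are one possible choice; they are not unique, although d is.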

The result above implies 

Theorem 3. Let f, g, h be integers such that f divides gh and is prime to g.
Then f divides h.

For by Theorem 2 we have af + bg = 1, afh + bgh = h. But by hypothesis
gh = fq, and h = (ah + bq)f is divisible by f.

We then have 

Theorem 4. Let p be a prime divisor of gh. Then p divides g or h.

For if p does not divide g, the g.c.d. of p and g is either 1 or an associate
of p. The latter is impossible, and therefore p is prime to g. By Theorem 3,
p then divides h.

We also have 

Theorem 5. Let m be an integer. Then the set of integers prime to m is
closed with respect to multiplication.

For let a and b be prime to m and d be the g.c.d. of ab and m. If ab is not
prime to m we have d > 1, and by Theorem 3 if d is prime to a it divides b.
But then a divisor c > 1 of d divides a or b as well as m, contrary to
hypothesis.

We may now conclude our proof of what is sometimes called the
Fundamental Theorem of Arithmetic.

Theorem 6. Every composite integer a is expressible in the form

(21)    a = ±p1 ... pr

uniquely apart from the order of the positive prime factors p1, ..., pr.

For if a = bc, every divisor of b or of c is a divisor of a. If a is composite,
it has divisors b such that 1 < b < |a|, and there exists a least divisor
p1 > 1 of a. But then p1 is a positive prime, a = p1a2 for |a2| < |a|. If
a2 is a prime, we write a2 = ±p2 with p2 a positive prime and have (21) for
r = 2. Otherwise a2 is composite and has a prime divisor p2 by the proof
above, a2 = p2a3 and a = p1p2a3 for |a3| < |a2|. After a finite number of
stages the sequence of decreasing positive integers |a| > |a2| > |a3| > ...
must terminate, and we have (21). If also a = ±q1 ... qs for positive
primes q1, ..., qs, the sign is uniquely determined by a, and p1 ... pr =
q1 ... qs. Then either we may arrange the qj so that q1 = p1, or p1 ≠ qj
for j = 1, ..., s. But if the divisor p1 of q1 ... qs is not equal to q1, it does
not divide q1 and by Theorem 4 must divide q2 ... qs. By our hypothesis
it does not divide q2 and must divide q3 ... qs. This finite process leads to
a contradiction. Thus p1 = q1, p2 ... pr = q2 ... qs, and the proof just
completed may be repeated, and we may take p2 = q2. Proceeding similarly,
we ultimately obtain r = s, the pi = qi for an appropriate ordering*
of the qi.
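
The existence half of the proof, which repeatedly removes the least divisor greater than 1, may be sketched in Python as follows. The function name factor and the convention of returning the sign separately are our own assumptions, made only for illustration.

```python
def factor(a):
    """Return (sign, primes) with a == sign * product(primes), primes nondecreasing."""
    if a == 0:
        raise ValueError("zero has no factorization into primes")
    sign = 1 if a > 0 else -1
    n = abs(a)
    primes = []
    p = 2
    while p * p <= n:
        # The least divisor p > 1 of n is necessarily a prime.
        while n % p == 0:
            primes.append(p)
            n //= p
        p += 1
    if n > 1:  # what remains is itself a prime
        primes.append(n)
    return sign, primes
```

Listing the primes in nondecreasing order fixes the ordering, so that two factorizations of the same integer may be compared term by term.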

8. The ideals of the ring of integers. Let M be a nonzero ideal of the
set E of all integers and m be the least positive integer in M. Then M
contains every element qm of the principal ideal {m}. If h is in M, we may
use Theorem 1 to write h = mq + r where 0 ≤ r < m. But mq and h are
in M, so that h − mq = r is in M. Our definition of m implies that r = 0
and thus that every element of M is in {m}. We have proved

Theorem 7. The nonzero ideals of E are principal ideals {m}, m a positive
integer.

The residue classes of E modulo {m} are now the classes

(22)    0̄, 1̄, ..., \overline{m − 1} .

For if a is any integer, we have a = mq + r for r = 0, 1, ..., m − 1.
Then a − r is in {m}, ā = r̄. Thus the elements of the residue class ring

        E − {m}

defined for m > 1 are given by (22). Then E − {m} is a ring whose zero
element is the class 0̄ of all integers divisible by m and whose unity element
is the class 1̄ of all integers whose remainder in Theorem 1 on division by
m is 1.

If a is an integer prime to m, the elements of the residue class ā are all
prime to m. For by Theorem 2 there exist integers c and d such that

(23)    ac + md = 1.

If b is in ā, then b = a + mq, bc + m(d − qc) = ac + m(qc + d − qc) =
ac + md = 1, and therefore b and m are relatively prime. But ā · c̄ = 1̄,
and Theorem 5 implies

Theorem 8. The residue classes ā in E − {m} defined for a prime to m
form a multiplicative abelian group.
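
For a small modulus the group of Theorem 8 may be listed and checked directly. The Python sketch below is an illustration only; the names prime_classes and inverse_class and the choice m = 12 are our own.

```python
from math import gcd

def prime_classes(m):
    """Representatives 0 < a < m of the residue classes prime to m (m > 1)."""
    return [a for a in range(1, m) if gcd(a, m) == 1]

def inverse_class(a, m):
    """Representative c with a*c congruent to 1 modulo m, found by direct search."""
    for c in prime_classes(m):
        if (a * c) % m == 1:
            return c
    raise ValueError("a is not prime to m")

G12 = prime_classes(12)
# Closure under multiplication, as in Theorem 5.
closed = all((a * b) % 12 in G12 for a in G12 for b in G12)
# Every class prime to 12 has an inverse class.
inverses = {a: inverse_class(a, 12) for a in G12}
```

The direct search stands in for the integers c and d of (23); the Euclidean process would produce c without any search.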

* We may order the pi so that p1 ≤ p2 ≤ ... ≤ pr and similarly assume that
q1 ≤ q2 ≤ ... ≤ qs. Then we obtain r = s, pi = qi for i = 1, ..., r.

If m is a composite integer cd where c > 1, d > 1, then m > c, m > d,
and c̄ and d̄ are both not the zero class. But c̄ · d̄ is the class of cd = m,
that is, the zero class 0̄, so that E − {m} has divisors of zero and is not an
integral domain. If m is a prime, then every a not in {m} is prime to m, and
Theorem 8 states that E − {m} is a field. We have proved

Theorem 9. An ideal M of E is a prime ideal if and only if M = {p} for
a positive prime p. Then M is a divisorless ideal.
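
Theorem 9 may be checked for small moduli by brute force. In the Python sketch below the function names zero_divisor_pairs and is_field are our own, and residue classes are represented by their least nonnegative representatives.

```python
def zero_divisor_pairs(m):
    """Pairs (c, d) of nonzero classes modulo m whose product is the zero class."""
    return [(c, d) for c in range(1, m) for d in range(1, m) if (c * d) % m == 0]

def is_field(m):
    """True if every nonzero class modulo m > 1 has a multiplicative inverse."""
    return all(any((a * x) % m == 1 for x in range(1, m)) for a in range(1, m))

# Moduli below 20 for which the residue class ring is a field.
field_moduli = [m for m in range(2, 20) if is_field(m)]
```

The moduli found in this way are exactly the primes below 20, and a composite modulus such as 6 exhibits the divisors of zero 2 and 3.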

We observe now that a ≡ b (M) for M = {m} means that a − b = mq,
that is, a − b is divisible by m. Thus it is customary in the Theory of
Numbers to write

(24)    a ≡ b (mod m)

(read: a congruent b modulo m) if a − b is divisible by m. But then ā = b̄,
and if we also have c̄ = d̄, we will have ā + c̄ = b̄ + d̄ as well as ā · c̄ =
b̄ · d̄. Hence if (24) and

(25)    c ≡ d (mod m)

hold, we have

(26)    a + c ≡ b + d (mod m) ,    ac ≡ bd (mod m) .

Thus the rules (26) for combining congruences are equivalent to the
definitions of addition and multiplication in E − {m}.

We next state the number-theoretic consequence of Theorem 8 and
Lemma 2 which is called Euler's Theorem and which we state as

Theorem 10. Let f(m) be the number of positive integers not greater than
m > 0 and prime to m. Then if a is prime to m we have

(27)    a^f(m) ≡ 1 (mod m) .

For f(m) is clearly the order of the multiplicative group defined in
Theorem 8. Our result then follows from Lemma 2.
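
Euler's Theorem is easily tested for small m. The following Python sketch computes f(m) by direct counting, exactly as in the statement of Theorem 10; the name euler_holds is our own, and the built-in three-argument pow is used for the modular power.

```python
from math import gcd

def f(m):
    """Number of integers k with 1 <= k <= m that are prime to m."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

def euler_holds(m):
    """Check that a^f(m) is congruent to 1 modulo m for every a in 1..m prime to m."""
    return all(
        pow(a, f(m), m) == 1 % m  # 1 % m is 0 when m == 1
        for a in range(1, m + 1)
        if gcd(a, m) == 1
    )
```

Counting is adequate only for small m; it is used here because it follows the definition literally.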

We next have the Fermat Theorem.

Theorem 11. Let p be a prime. Then

(28)    a^p ≡ a (mod p)

for every integer a.

For (28) holds if a is divisible by p. Otherwise a is prime to p, and ā is
one of the residue classes 1̄, 2̄, ..., \overline{p − 1}. Thus f(p) = p − 1 and
by Theorem 10 a^(p−1) − 1 is divisible by p, so that a(a^(p−1) − 1) = a^p − a
is also divisible by p.

The ring E − {p} defined by a positive prime p is a field* P whose
nonzero elements form an abelian group P* of order p − 1. The elements
of P* are the distinct roots of the equation x^(p−1) = 1 and are not all roots
of any equation of lower degree. Then it may be shown that P* is a cyclic
multiplicative group [r̄] where r is an integer such that 1̄, r̄, r̄^2, ...,
r̄^(p−2) are a complete set of nonzero residue classes modulo {p}. Such an
integer r is called a primitive root modulo p. We then have

* This field is equivalent to the subfield generated by its unity quantity of any field
of characteristic p.

Theorem 12. Let p be a prime of the form 4n + 1. Then there exists an
integer t such that

(29)    t^2 + 1 ≡ 0 (mod p) .

For p − 1 = 4n, and we let r be a primitive root modulo p, t = r^n. Then
t^4 = r^(p−1) ≡ 1 (mod p), and t^4 − 1 = (t^2 + 1)(t^2 − 1), so that
(t̄^2 + 1̄)(t̄^2 − 1̄) = 0̄ in the field E − {p}. However, t̄^2 ≠ 1̄ since r is
primitive and 2n < p − 1, and thus t̄^2 + 1̄ = 0̄ as desired.
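
The proof of Theorem 12 is constructive once a primitive root is known. The Python sketch below finds a primitive root by exhaustive search, which is practical only for small p, and then forms t = r^n. The function names are our own, and the primality of p is assumed rather than checked.

```python
def primitive_root(p):
    """Least r in 2..p-1 whose powers give all nonzero classes modulo an odd prime p."""
    for r in range(2, p):
        seen = set()
        x = 1
        for _ in range(p - 1):
            x = (x * r) % p
            seen.add(x)
        if len(seen) == p - 1:
            return r
    raise ValueError("no primitive root found; is p an odd prime?")

def root_of_minus_one(p):
    """For a prime p = 4n + 1, return t = r^n reduced modulo p, so t^2 + 1 is divisible by p."""
    if p % 4 != 1:
        raise ValueError("p must have the form 4n + 1")
    n = (p - 1) // 4
    return pow(primitive_root(p), n, p)
```

For p = 13 the least primitive root is 2 and n = 3, giving t = 8, and indeed 8^2 + 1 = 65 = 5 · 13.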

There are a number of other results on congruences which are corollaries
of theorems on rings and fields. However, we shall not mention them here. 

9. Quadratic extensions of a field. If F is a subfield of a field K which
is a linear space u1F + ... + unF over F, we say that K is a field of degree
n over F. The theory of linear spaces of Chapter IV implies that u1 may be
taken to be any nonzero element of K. Hence we may take u1 to be the
unity element 1 of F. Then if n = 1, the field K is F. We call K a quadratic,
cubic, quartic, or quintic field over F according as n = 2, 3, 4, or 5.

Let n = 2 so that K has a basis u1 = 1, u2 over F. The quantities of K
are then uniquely expressible in the form k = c1 + c2u2 for c1 and c2 in F,
and k is in F, k = k · 1 + 0u2, if and only if c2 = 0. Clearly, if k is in F,
then 1, k do not form a basis of K over F. We now say that a quantity u
in K generates K over F if 1, u are a basis of K over F. Then u generates K
over F if and only if u is not in F. For if 1, u are linearly dependent in F we
have a1 + a2u = 0 for a2 ≠ 0, and u = −a1/a2 is in F.

The elements k of a quadratic field have the property that 1, k, k^2 are
linearly dependent in F, c0k^2 + c1k + c2 = 0 for c0, c1, c2 not all zero and
in F. If c0 = 0, then c1 cannot be zero, and k = −c2/c1 is in F, k is a root
of the monic polynomial (x − k)^2 with coefficients in F. If k is not in F,
then c0 ≠ 0, k is a root of a monic polynomial of degree two. Thus every
element k of a quadratic field is a root of an equation

(30)    f(x, k) = x^2 − T(k)x + N(k) = 0

with T(k) and N(k) in F. In particular, if u generates K over F we have

(31)    u^2 − bu + c = 0

for b and c in F, and we propose to find the value of T(k) and N(k) as
polynomials in b, c and the coordinates k1 and k2 in F of k = k1 + k2u.

Define a correspondence S on K to K by

(32)    k = k1 + k2u → k^S = k1 + k2u^S ,

for all k1 and k2 in F where u^S is in K. Then (32) is a one-to-one
correspondence of K and itself if and only if u^S generates K over F. If k0 =
k3 + k4u for k3 and k4 in F we have k0^S = k3 + k4u^S, k + k0 = (k1 + k3) +
(k2 + k4)u, so that (k + k0)^S = k1 + k3 + (k2 + k4)u^S, and we have

(33)    (k + k0)^S = k^S + k0^S .


Also kk0 = k1k3 + (k1k4 + k2k3)u + k2k4u^2 = (k1k3 − k2k4c) + (k1k4 + k2k3
+ k2k4b)u, while k^S k0^S = k1k3 + (k1k4 + k2k3)u^S + k2k4(u^S)^2. But then

(34)    (kk0)^S = k^S k0^S

if and only if (u^S)^2 = bu^S − c, that is, u^S is a root in K of the quadratic
equation x^2 − bx + c = 0. But the quadratic equation can have only two
distinct roots in a field K, and the sum of the roots is b, so that

(35)    u^S = u or u^S = b − u .

In the former case S is the identity correspondence k ↔ k. In either case
S defines a self-equivalence of K leaving the elements of F unaltered and
is called an automorphism over F of K.

If K were any field of degree n over F and if S and T were automorphisms
over F of K, we would define ST as the result k → k^(ST) = (k^S)^T of
applying first k → k^S and then k^S → (k^S)^T. It is easy to show that the set
of all automorphisms over F of K is a group G with respect to the operation
just defined. In case G has order equal to the degree n of K over F it is called
the Galois group of K over F, and that branch of algebra called the Galois
Theory is concerned with relative properties of K and G. In our present
case, if u^S = b − u, then u^(S^2) = (u^S)^S = (b − u)^S = b − (b − u) = u so
that S^2 is the identity automorphism. But b − u = u if and only if 2u = b,
which is not possible since u is not in F unless K is a modular field in which
u + u = 0. Hence if K is a nonmodular quadratic field over F, the
automorphism group of K over F is the cyclic group [S] of order 2 and is the
Galois group of K.

We see now that if k = k1 + k2u, then kk^S = (k1 + k2u)[k1 + k2(b − u)]
= k1^2 + k1k2b + k2^2 u(b − u). But bu − u^2 = c. Hence if

(36)    T(k) = k + k^S ,    N(k) = kk^S ,

we have

(37)    T(k) = 2k1 + k2b ,    N(k) = k1^2 + k2^2c + k1k2b ,

and the polynomial of (30) is

(38)    f(x, k) = (x − k)(x − k^S) .

The function T(k) is called the trace of k, and N(k) is called the norm of k.
Moreover, we may show that 

(39)    T(a1k + a2k0) = a1T(k) + a2T(k0) ,    N(kk0) = N(k) · N(k0)

for every a1 and a2 of F and k and k0 of K. For T(a1k + a2k0) = (a1k + a2k0) +
(a1k + a2k0)^S = a1k + a2k0 + a1k^S + a2k0^S = a1(k + k^S) + a2(k0 + k0^S) as
desired. Similarly,* N(kk0) = kk0(kk0)^S = kk0k^Sk0^S = (kk^S)(k0k0^S). Note
that if k is in F, then N(k) = k^2.
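
The formulas for the trace and norm may be checked on an example. The Python sketch below takes F to be the rational field and represents k = k1 + k2u by the pair (k1, k2), where u satisfies u^2 = bu − c. The function make_field and the names it returns are our own, and the values b = 1, c = −1 in the usage note are merely an illustrative assumption.

```python
from fractions import Fraction as Fr

def make_field(b, c):
    """Arithmetic in F + uF over the rationals, where u*u = b*u - c."""
    b, c = Fr(b), Fr(c)

    def mul(k, h):
        # (k1 + k2 u)(h1 + h2 u), reduced by u^2 = b u - c.
        k1, k2 = k
        h1, h2 = h
        return (k1 * h1 - k2 * h2 * c, k1 * h2 + k2 * h1 + k2 * h2 * b)

    def conj(k):
        # The automorphism S with u^S = b - u.
        k1, k2 = k
        return (k1 + k2 * b, -k2)

    def trace(k):
        return 2 * k[0] + k[1] * b

    def norm(k):
        k1, k2 = k
        return k1 * k1 + k1 * k2 * b + k2 * k2 * c

    return mul, conj, trace, norm
```

With mul, conj, trace, norm = make_field(1, -1) and k = (2, 3), the product mul(k, conj(k)) is (norm(k), 0), in agreement with N(k) = kk^S.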

Let us now assume that the field K is nonmodular so that K = F + uF
where u satisfies (31). Then K is also generated by the root u − b/2 of the
equation

(40)    x^2 = a    (a in F) ,

if a = (u − b/2)^2 = u^2 − bu + b^2/4 = −c + b^2/4. Let us then assume
without loss of generality that u is a root of (40) so that in (31) we have
b = 0, c = −a. Then (37) has the simplified form given by

(41)    T(k) = 2k1 ,    N(k) = k1^2 − k2^2a .

Now if a = d^2 for d in F, we have u^2 = d^2, (u + d)(u − d) = 0, whereas u is
not in F and u + d ≠ 0, u − d ≠ 0. This is impossible in a field.

The quantities of K now consist of all polynomials in u with coefficients
in F. For if k(x) is any polynomial in F[x], we may write k(x) = k1(x^2) +
k2(x^2)x, k(u) = k1(a) + k2(a)u. Thus every nonmodular quadratic field is the
ring F[u] of all polynomials with coefficients in F in an algebraic root u of an
equation x^2 = a where a is in F, a is not the square of any quantity of F,
and F is nonmodular. Since K is a field, it is actually the field F(u) of all
rational functions in u with coefficients in F.

Conversely, if the nonmodular field F and the equation x^2 = a in F are
given, then K is defined. For we may take

(42)    u = ( 0  a ) ,    u^2 = ( a  0 )
            ( 1  0 )            ( 0  a )

and identify F with the set of all two-rowed scalar matrices with elements
in F (so as to make K contain F). Then every polynomial in u has the form
k = k1 + k2u and N(k) = k1^2 − k2^2a = 0 if and only if k1 = k2 = k = 0.
For otherwise k2 ≠ 0, a = (k1/k2)^2 contrary to hypothesis. But then we
take K = F + uF and have k^(−1) = (k1^2 − k2^2a)^(−1)(k1 − k2u) for every
k ≠ 0 of K. Hence K is the field F(u). That K is nonmodular is evident since
the unity element of K is that of F. We have thus constructed all quadratic
fields K over a nonmodular field F in terms of equations x^2 = a for a in
F, a ≠ d^2 for any d of F.

* The trace function is thus called a linear function and the norm function a
multiplicative function.
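
The matrix construction may be tried with F the rational field. The Python sketch below assumes that u is the two-rowed matrix with rows (0, a) and (1, 0), so that u^2 is the scalar matrix a; the function names elem, mat_mul, and inverse are our own.

```python
from fractions import Fraction as Fr

def mat_mul(x, y):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def elem(k1, k2, a):
    """The matrix k1*I + k2*u for u with rows (0, a) and (1, 0)."""
    k1, k2, a = Fr(k1), Fr(k2), Fr(a)
    return ((k1, k2 * a), (k2, k1))

def inverse(k1, k2, a):
    """Inverse of k1 + k2*u, namely (k1 - k2*u) divided by the norm k1^2 - k2^2*a."""
    n = Fr(k1) ** 2 - Fr(k2) ** 2 * Fr(a)
    if n == 0:
        raise ZeroDivisionError("the norm vanishes, so there is no inverse")
    return elem(Fr(k1) / n, -Fr(k2) / n, a)
```

With a = 2, which is not a rational square, the element 3 + 2u has norm 1 and inverse 3 − 2u.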

Observe in closing that if K = F(u), then K = F(v) for every v = bu
such that b ≠ 0 is in F. But v is a root of

(43)    x^2 = b^2a .

Thus we may replace the defining quantity a in F by any multiple b^2a for
b ≠ 0 in F. It is also shown easily that if K is defined by a and K0 by a0,
then K and K0 are fields equivalent over F if and only if a0 = b^2a for some
b ≠ 0 of F.

10. Integers of quadratic fields. The Theory of Algebraic Numbers is 
concerned principally with the integral domain consisting of the elements 
called algebraic integers in a field K of degree n over the field of all rational 
numbers. We shall discuss somewhat briefly the case n = 2. 

Let, then, K be a quadratic field over the rational field so that K is
generated by a root u of u^2 = a where a is rational and not a rational square.
By the use of (43) we may multiply a by the square of an integer and hence
take a integral. Write a = c^2d where d has no square factor and c and d are
ordinary integers. If we take b = c^(−1) in (43) we replace a by d. Hence
every quadratic field is generated by a root u of the quadratic equation

(44)    x^2 = a = ±p1 ... pr ,

where the pi are distinct positive primes and r ≥ 1 for a > 0, while if
a = −1 we interpret (44) as the case a negative and r = 0.

The quantities k of K have the form k = k1 + k2u where k1 and k2 are
ordinary rational numbers. We call k an integer of K if the coefficients
T(k), N(k) of (30) are integers. Thus k is an integer of K if and only if
2k1 and k1^2 − k2^2a are both ordinary integers. We shall determine all
integers of K, stating our final results as

Theorem 13. The integers of a quadratic field K form an integral domain J
containing the ring E of all ordinary integers. Then J consists of all linear
combinations

(45)    c1 + c2w

for c1 and c2 in E, where w = u if a ≡ 2 (mod 4) or a ≡ 3 (mod 4) but

(46)    w = (1 + u)/2

if a ≡ 1 (mod 4).

Note that a ≢ 0 (mod 4) since a has no square factor. We write

        k1 = b1/b0 ,    k2 = b2/b0 ,    k = (b1 + b2u)/b0

for b0, b1, b2 in E and b0 the positive least common denominator of k1 and k2.
Then 1 is the g.c.d. of b0, b1, b2. Now 2k1 is an integer, b0 divides 2b1,
k1^2 − k2^2a = b0^(−2)(b1^2 − b2^2a), so that b0^2 divides b1^2 − b2^2a. If p
were an odd prime factor of b0, it would divide 2b1 only if it divided b1. But
then p^2 would divide b1^2, and p^2 would divide b1^2 − b2^2a as well as
−b2^2a. Since a has no square factors this is possible only if p divides b2, a
contradiction. Hence b0 is a power of 2. If b0 ≥ 4 divides 2b1, then 2 divides
b1, 4 divides b1^2 and −b2^2a, and thus 2 divides b2. We have a contradiction
and have proved that b0 = 1, 2. If b0 = 2 and b1 is even, then 4 divides
−b2^2a, and hence 2 divides b2 contrary to the definition of b0. Similarly, if
b2 were even, then 4 would divide b1^2, a contradiction. Hence b1 = 2m1 + 1,
b2 = 2m2 + 1 for ordinary integers m1 and m2, and b1^2 − b2^2a =
4[m1^2 + m1 − a(m2^2 + m2)] + 1 − a is divisible by 4 if and only if
a ≡ 1 (mod 4). Thus we have proved that J consists of the elements (45)
with w = u if a ≡ 2, 3 (mod 4). But if a ≡ 1 (mod 4), then we have shown
that either k = b1 + b2u with b1 and b2 in E, b0 = 1, or k = m1 + m2u + w
for w in (46). However, u = 2w − 1, and in either case k has the form (45)
with c1 and c2 in E.
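
The description of J just obtained may be compared with the defining condition on the trace and norm. The Python sketch below is an illustration under our own naming: is_integer_of_K tests whether 2k1 and k1^2 − k2^2a are ordinary integers, and in_basis_form tests whether k1 + k2u has the form c1 + c2w with c1 and c2 in E. The integer a is assumed to have no square factor.

```python
from fractions import Fraction as Fr

def is_integer_of_K(k1, k2, a):
    """True if the trace 2*k1 and the norm k1^2 - k2^2*a are ordinary integers."""
    k1, k2 = Fr(k1), Fr(k2)
    return (2 * k1).denominator == 1 and (k1 * k1 - k2 * k2 * a).denominator == 1

def in_basis_form(k1, k2, a):
    """True if k1 + k2*u = c1 + c2*w with integers c1, c2 and w as in Theorem 13."""
    k1, k2 = Fr(k1), Fr(k2)
    if a % 4 == 1:
        # w = (1 + u)/2 gives k1 = c1 + c2/2 and k2 = c2/2.
        c1, c2 = k1 - k2, 2 * k2
    else:
        # w = u gives c1 = k1 and c2 = k2.
        c1, c2 = k1, k2
    return c1.denominator == 1 and c2.denominator == 1
```

Python reduces a negative a modulo 4 to a residue between 0 and 3, so that a = −3 and a = −7 are correctly treated as congruent to 1.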

It remains to show that J is an integral domain. The elements of J are
in the field K, and thus it suffices to show that k + h, kh are in J for every
k and h of J. But the sum of k and h of the form (45) is clearly of that
form, and the product

        (c1 + c2w)(d1 + d2w) = c1d1 + (c1d2 + c2d1)w + c2d2w^2

is of the form (45) if w^2 is of that form. But this condition holds since if
w = u then w^2 = a + 0w, while otherwise a = 4m + 1 for an integer m,
T(w) = 1/2 + 1/2 = 1, N(w) = (1 − a)/4 = −m, w^2 − w − m = 0,

(47)    w^2 = m + w .

This completes the proof.

The units of J are elements k such that kh = 1 for h also in J. Then
N(kh) = N(k)N(h) = 1, N(k) is an ordinary integer which divides unity,

(48)    N(k) = ±1 .

Conversely, if (48) holds, k has the property kk^S = 1 or −1. But k^S is in
J when k is in J since T(k^S) = T(k) and N(k^S) = N(k). Hence k^(−1) = k^S
or −k^S is in J. Thus (48) is a necessary and sufficient condition that k be
a unit of J.

If a ≡ 2, 3 (mod 4) so that k = c1 + c2u, then (48) is equivalent to

(49)    c1^2 − c2^2a = ±1 .

However, if a ≡ 1 (mod 4), then (48) becomes N(k) = c1^2 + c1c2 − mc2^2 =
±1 so that 4N(k) = 4c1^2 + 4c1c2 + c2^2 − (4m + 1)c2^2 = ±4. But this is
equivalent to

(50)    (2c1 + c2)^2 − c2^2a = ±4 .

We may determine the units of J completely and simply in case a is
negative. For both (49) and (50) have the form x1^2 + x2^2g = ±1, ±4 for
g = −a > 0, and this is possible for ordinary integers x1 and x2 and g > 4
only if x2 = 0, c1 = 1, −1. Now a has no square factors, and hence g ≠ 4,
the only possible remaining cases are g = 1, 2, 3. If g = 2 we have x1^2 +
2x2^2 = 1 only if x2 = 0. Hence we have proved that the units of J are 1, −1
for every a < 0 save only a = −1, −3.

Now let a = −1 so that (49) becomes c1^2 + c2^2 = 1. Then one of c1 and c2
is zero, the other is 1 or −1, and the units of J are

(51)    1, u, −u, u^2 = −1 .

In the remaining case a = −3 we use (50) and have (2c1 + c2)^2 + 3c2^2 =
4, and if c2 ≠ 0 we must have c2 = ±1, 2c1 + c2 = ±1 with any choice
of signs. Then c2 = 1 gives c1 = 0 or −1, while c2 = −1 gives 0 or 1
for c1. Clearly, the units of J are

(52)    1, −1, w, −w, w^2, −w^2 .

The units of J form a multiplicative group, and we have shown that if a is 
negative this group is a finite group. 
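
For negative a the units may be listed by a finite search of equations (49) and (50). The Python sketch below is an illustration only; the function name units and the search bound are our own, the bound 3 being ample since for a < 0 these equations force |c2| ≤ 2 and |c1| ≤ 2.

```python
def units(a, bound=3):
    """Pairs (c1, c2) with c1 + c2*w a unit of J, for a < 0 without square factors."""
    found = []
    for c1 in range(-bound, bound + 1):
        for c2 in range(-bound, bound + 1):
            if a % 4 == 1:
                # Equation (50): (2 c1 + c2)^2 - c2^2 a = +4 or -4.
                ok = (2 * c1 + c2) ** 2 - c2 * c2 * a in (4, -4)
            else:
                # Equation (49): c1^2 - c2^2 a = +1 or -1.
                ok = c1 * c1 - c2 * c2 * a in (1, -1)
            if ok:
                found.append((c1, c2))
    return found
```

The search returns four pairs for a = −1, six pairs for a = −3, and only the pairs belonging to 1 and −1 for such values as a = −2 and a = −7.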

If a = 2, then h = 3 + 2u has norm 9 − 4u^2 = 9 − 8 = 1. Hence h is
a unit of J, and so is h^t for every integer t. If h^s = h^t for s ≠ t, we may
take t > s and have h^(t−s) = 1. But we may regard u as the ordinary positive
√2 and have h > 5, h^(t−s) > 5^(t−s) > 1. Hence the multiplicative group of
the units of J is an infinite group. It may similarly be shown to be infinite
for every positive a.

We shall not study the units of quadratic fields further but shall pass on 
to some results on primes and prime ideals in two special cases. 

11. Gauss numbers. The complex numbers of the form x + yi with rational
x and y are called the Gauss complex numbers, and those for which
x and y are integers are called the Gauss integers. Then the Gauss complex
numbers are the elements of a quadratic field K of our study with

        u^2 = −1 ,

and the Gauss integers comprise its integral domain J. We have determined
the units ±1, ±u of J and shall now study its divisibility properties.
Our first result is the Division Algorithm which we state as

Theorem 14. Let f and g ≠ 0 be in J. Then there exist elements h and r
in J such that

(53)    f = gh + r

and 0 ≤ N(r) < N(g).

For fg^(−1) is in K, fg^(−1) = k1 + k2u with rational k1 and k2. Every rational
number t lies in an interval s ≤ t < s + 1 for an ordinary integer s. If
s ≤ t ≤ s + 1/2 then (t − s) ≤ 1/2, while if s + 1/2 < t < s + 1 then
|t − (s + 1)| < 1/2. Hence there exist ordinary integers h1 and h2 such that

(54)    s1 = k1 − h1 ,    s2 = k2 − h2 ,    |s1| ≤ 1/2 ,    |s2| ≤ 1/2 .

Put h = h1 + h2u, s = s1 + s2u, r = sg so that fg^(−1) = h + s, f = gh +
sg = gh + r. Then N(s) = s1^2 + s2^2 ≤ 1/2, N(r) = N(s)N(g) < N(g) as
desired.

We observe that the quotient h and the remainder r need not be unique in
our present case. For example, if f = 2 + u and g = 1 + u, we have
N(g) = 2. Then f = 1 · g + 1 = 2 · g − u with N(1) = N(−u) = 1.
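
The proof of Theorem 14 is itself a procedure for finding h and r. The Python sketch below represents x + yu by the pair (x, y) of ordinary integers and rounds each coordinate of fg^(−1) as in (54). The function names gauss_divmod, nearest, and norm are our own; the rounding rule chosen at a tie is one of the choices the proof permits.

```python
from fractions import Fraction as Fr

def norm(z):
    """N(x + y u) = x^2 + y^2 for u^2 = -1."""
    return z[0] * z[0] + z[1] * z[1]

def nearest(t):
    """An ordinary integer h with |t - h| <= 1/2."""
    s = t.numerator // t.denominator  # s <= t < s + 1
    return s if t - s <= Fr(1, 2) else s + 1

def gauss_divmod(f, g):
    """Return (h, r) with f = g*h + r and N(r) < N(g), for Gauss integers f and g != 0."""
    n = norm(g)
    if n == 0:
        raise ZeroDivisionError("g must be nonzero")
    # f/g = f * g^S / N(g) = k1 + k2 u.
    k1 = Fr(f[0] * g[0] + f[1] * g[1], n)
    k2 = Fr(f[1] * g[0] - f[0] * g[1], n)
    h = (nearest(k1), nearest(k2))
    gh = (g[0] * h[0] - g[1] * h[1], g[0] * h[1] + g[1] * h[0])
    r = (f[0] - gh[0], f[1] - gh[1])
    return h, r
```

For f = 2 + u and g = 1 + u this rounding yields h = 1 − u and r = u, a third admissible pair beside the two given in the text.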

We shall use the Division Algorithm to prove the existence of a g.c.d. 
Our proof will be different and simpler than that we gave in the case of 
polynomials and indicated in the case of integers but has the defect of being 
merely existential and not constructive. We first prove 
Theorem 16. Every ideal M 0 / J is a principal ideal. 

For the norms of the nonzero elements of M are positive integers, and
there exists an m ≠ 0 in M such that N(m) ≤ N(f) for every nonzero f of M. By
Theorem 14 we have f = mh + r for h and r in J and N(r) < N(m). But
f, m and mh are in M, so is r = f - mh, and it follows that r must be zero.
Thus f = mh, M = {m}.

The above result then implies 

Theorem 16. Let f and g be nonzero elements of J. Then there exist b and c
in J such that

(55) d = bf + cg

is a common divisor of f and g. Thus d is a g.c.d. of f and g.

For the set of all elements of the form xf + yg with x and y in J is an
ideal M of J. By Theorem 15 we have M = {d}, d in M has the form (55)
and divides f = 1·f + 0·g, and g = 0·f + 1·g in M.
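
The proof just given is existential. A constructive alternative, which is not the argument of the text, is to apply the Division Algorithm of Theorem 14 repeatedly in the manner of Euclid's process. The following is a minimal sketch under that assumption; the pair representation (a, b) for a + bu and all function names are our own illustrative choices.

```python
def norm(z):
    return z[0] * z[0] + z[1] * z[1]


def add(z, w):
    return (z[0] + w[0], z[1] + w[1])


def sub(z, w):
    return (z[0] - w[0], z[1] - w[1])


def mul(z, w):
    # (a + bu)(c + du) with u^2 = -1
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])


def quotient(f, g):
    """A quotient h of Theorem 14: round f/g coordinatewise."""
    n = norm(g)
    t = mul(f, (g[0], -g[1]))          # f * conj(g)
    return ((2 * t[0] + n) // (2 * n), (2 * t[1] + n) // (2 * n))


def gauss_xgcd(f, g):
    """Return (d, b, c) with d = b*f + c*g a common divisor of f and g."""
    r0, r1 = f, g
    b0, b1 = (1, 0), (0, 0)            # invariant: r0 = b0*f + c0*g
    c0, c1 = (0, 0), (1, 0)            # invariant: r1 = b1*f + c1*g
    while r1 != (0, 0):
        h = quotient(r0, r1)
        r0, r1 = r1, sub(r0, mul(h, r1))   # the norm of r1 strictly decreases
        b0, b1 = b1, sub(b0, mul(h, b1))
        c0, c1 = c1, sub(c0, mul(h, c1))
    return r0, b0, c0


d, b, c = gauss_xgcd((5, 0), (3, 1))   # f = 5, g = 3 + u
print(d, norm(d), add(mul(b, (5, 0)), mul(c, (3, 1))) == d)
```

The element d returned is determined only up to a unit multiplier ±1, ±u, in agreement with the fact that a g.c.d. in J is unique only up to associates.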

The above result may now be seen to imply that if f divides gh and is
prime to g, then f divides h, and also if p is a prime of J dividing gh, then p
divides g or h. Moreover, if f is a composite integer of J, then f = gh for
nonunit g and h, N(g) < N(f). Then if p is a divisor of f of least norm
greater than 1 it is a prime divisor of f, and we continue the proof of
Theorem 6 to obtain

Theorem 17. Every composite Gauss integer is expressible as a product 

(56) f = p_1 ... p_r

of primes p_i which are determined uniquely by f apart from their order and
unit multipliers.

We observe that Theorem 17 implies that an ideal M of J is a prime ideal
(and, in fact, a divisorless ideal) if and only if M = {d} for a prime d of J.
Let us then determine the prime quantities d = d_1 + d_2 u of J. We see first
that if d is a prime of J, so is d^S = d_1 - d_2 u. For if d^S = gh with N(g) ≠ 1,
N(h) ≠ 1, we have d = g^S h^S for N(g^S) = N(g), N(h^S) = N(h), and d is
composite. We now prove

Theorem 18. A positive prime p of E is either a prime of J or the product
N(d) = dd^S, where d is a prime of J. Every prime d of J is either associated
with a positive prime of E or arises from the factorization in J of a positive
prime p = dd^S of E which is composite in J.

For if p is a positive prime of E and is composite in J, then p = dk for
N(d) > 1, N(k) > 1, N(p) = p^2 = N(d)N(k). But then p = N(d). If d =
gh with N(g) > 1, then N(d) = p = N(g)N(h) so that N(h) = 1, h is a
unit, d is a prime. Conversely, let d be a prime of J. Then c = N(d) is a
positive integer of E and is either a prime or is composite. But c = dd^S can
have at most two prime factors in E, c = p p_0 for positive primes p and p_0
of E. We may assume that p is associated with d and p_0 with d^S so that
p_0 = p_0^S is associated with d and with p. Since p and p_0 are both in E, we
have p_0 = ±p. But p and p_0 are positive and must be equal, N(d) = p^2.
Therefore d is associated with the prime p of E which is prime in J.

We now clearly complete our study of the primes of J by proving 

Theorem 19. A positive prime p of E is a prime of J if and only if p has
the form 4n + 3.

To prove this result we observe first that if t is an odd integer of E we
have t = 2s + 1, t^2 = 4s^2 + 4s + 1 = 4s(s + 1) + 1. One of s and s + 1
is even, t^2 = 8r + 1 ≡ 1 (mod 8). If t is even, we have t^2 ≡ 0 (mod 4)
and t^2 ≡ 0, 4 (mod 8). Thus a sum of two squares is congruent to 0, 1, 2, 4,
or 5 modulo 8 while 4n + 3 is congruent to 3 or 7 modulo 8. It follows that
p = 4n + 3 ≠ x^2 + y^2. We now assume that p = gh for g and h in J and
have p^2 = N(p) = N(g)N(h). If neither g nor h were a unit, we would
have N(g) > 1, N(h) > 1, and both of these integers would be divisors of p^2.
But then N(g) = N(h) = p = x^2 + y^2 which is impossible. Hence, p = 4n + 3
is prime in J. We note that 2 = 1 + 1 = (1 + u)(1 - u) is composite in
J and that it remains to show that p = 4n + 1 is composite in J. We know
by Theorem 12 that there exists an integer b in E such that b^2 + 1 is divisible
by p. If p divides b + u or b - u, then b ± u = p(k_1 + k_2 u), and ±1 =
p k_2, which is impossible. But a prime p of J cannot divide the product
(b + u)(b - u) = b^2 + 1 without dividing one of its factors b + u, b - u.
Hence p is not a prime of J. This completes the proof.
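
Theorem 19 is easy to test numerically for small primes. By Theorem 18 a positive prime p of E is composite in J exactly when p = N(d) = x^2 + y^2, so the following sketch simply searches for such a representation; it also searches for the integer b of the proof with b^2 + 1 divisible by p. The function names and the brute-force searches are our own illustrative choices and make no claim to efficiency.

```python
from math import isqrt


def is_prime(n):
    """Trial-division primality test for ordinary integers."""
    if n < 2:
        return False
    return all(n % q != 0 for q in range(2, isqrt(n) + 1))


def two_squares(p):
    """Return (x, y) with x^2 + y^2 = p and 0 <= x, or None if none exists."""
    for x in range(isqrt(p) + 1):
        y = isqrt(p - x * x)
        if x * x + y * y == p:
            return (x, y)
    return None


def root_of_minus_one(p):
    """Return the least b in 1..p-1 with b^2 + 1 divisible by p, or None."""
    for b in range(1, p):
        if (b * b + 1) % p == 0:
            return b
    return None


for p in (q for q in range(2, 40) if is_prime(q)):
    kind = "prime in J" if two_squares(p) is None else "composite in J"
    print(p, p % 4, kind, two_squares(p), root_of_minus_one(p))
```

For p = 13 the sketch finds 13 = 2^2 + 3^2 and b = 5, so that 13 divides (5 + u)(5 - u) = 26 without dividing either factor.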

We use the result above to derive an interesting theorem of the Theory of
Numbers. We call a positive integer c a sum of two squares if c = x^2 + y^2
for x and y in E. Then we have

Theorem 20. Write c = f^2 g where f and g are positive integers and g has
no square factors. Then c is a sum of two squares if and only if no prime factor
of g has the form 4n + 3.

For if c = x^2 + y^2 = (x + yu)(x - yu), we may write x + yu = d_1
... d_r for primes d_i in J, c = N(d_1) ... N(d_r). Then N(d_i) = p_i is a prime
of E if and only if p_i ≠ 4n + 3 and otherwise N(d_i) = p_i^2. Thus the prime
factors of c of the form 4n + 3 occur to even powers and are not factors of g.
Conversely, if g = p_1 ... p_r for positive primes p_i of E not of the form 4n +
3, we have p_i = N(d_i) for d_i in J, g = N(d_1) ... N(d_r) = N(d_1 ... d_r),
and c = N(f d_1 ... d_r) = N(k) = x^2 + y^2 for k = x + yu in J.
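
The criterion of Theorem 20 may be compared with a direct search. In the sketch below, squarefree_part computes the factor g of c = f^2 g, criterion applies the test on the prime factors of g, and is_sum_of_two_squares looks for x and y by brute force; all three names are hypothetical helpers introduced only for this illustration.

```python
from math import isqrt


def squarefree_part(c):
    """Return g where c = f^2 * g and g has no square factors."""
    g, p = 1, 2
    while p * p <= c:
        e = 0
        while c % p == 0:
            c //= p
            e += 1
        if e % 2 == 1:          # p occurs to an odd power in c
            g *= p
        p += 1
    return g * c                # what is left is 1 or a single prime


def prime_factors(n):
    """Distinct prime factors of n by trial division."""
    out, p = [], 2
    while p * p <= n:
        if n % p == 0:
            out.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        out.append(n)
    return out


def criterion(c):
    """Theorem 20: no prime factor of g has the form 4n + 3."""
    return all(p % 4 != 3 for p in prime_factors(squarefree_part(c)))


def is_sum_of_two_squares(c):
    """Direct search for x, y in E with x^2 + y^2 = c."""
    return any(isqrt(c - x * x) ** 2 == c - x * x for x in range(isqrt(c) + 1))


agree = all(criterion(c) == is_sum_of_two_squares(c) for c in range(1, 501))
print(agree)
```

For instance c = 45 = 3^2 · 5 has g = 5 and 45 = 6^2 + 3^2, while c = 21 has g = 21 with the prime factors 3 and 7 of the form 4n + 3 and is not a sum of two squares.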

Note in closing that a positive prime p of the form 4n + 3 divides x^2 + y^2
if and only if p divides both x and y. For p is a prime of J and divides
(x + yu)(x - yu) if and only if p divides either x + yu or x - yu. Then
x ± yu = p(k_1 ± k_2 u), x = p k_1, y = p k_2 for integers k_1 and k_2 of E. It
follows, then, that if x, y, z are ordinary integers such that

(57) x^2 + y^2 = z^2,

every prime divisor p = 4n + 3 of z divides both x and y. Then let x, y, z
have g.c.d. unity. It follows that the odd prime divisors of z have the form
4n + 1. We may show readily also that x and y cannot both be even or be
odd and that z must be odd.*
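
These statements about the solutions of (57) can be checked on small cases. The following sketch, whose function names and search bound are our own choices, lists the solutions with g.c.d. unity and 0 < x ≤ y, z ≤ 100, and verifies for each that z is odd, that exactly one of x and y is even, and that every prime divisor of z has the form 4n + 1.

```python
from math import gcd, isqrt


def primitive_triples(limit):
    """All (x, y, z) with x^2 + y^2 = z^2, 0 < x <= y, z <= limit, g.c.d. 1."""
    out = []
    for x in range(1, limit):
        for y in range(x, limit):
            z = isqrt(x * x + y * y)
            if z <= limit and z * z == x * x + y * y and gcd(gcd(x, y), z) == 1:
                out.append((x, y, z))
    return out


def prime_divisors(n):
    """Distinct prime divisors of n by trial division."""
    out, p = [], 2
    while p * p <= n:
        if n % p == 0:
            out.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        out.append(n)
    return out


triples = primitive_triples(100)
ok = all(
    z % 2 == 1
    and (x + y) % 2 == 1
    and all(p % 4 == 1 for p in prime_divisors(z))
    for x, y, z in triples
)
print(len(triples), ok)
```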

12. An integral domain with nonprincipal ideals. We shall close our
exposition with a discussion of some properties of the ring J of integers of
the field K defined by a = u^2 = -5. Since -5 ≡ 3 (mod 4), the elements
of J have the form k_1 + k_2 u for k_1 and k_2 in the set E of all integers.
Observe that if b is an integer of E which is a divisor in J of k_1 + k_2 u, then b
must divide both k_1 and k_2. For k_1 + k_2 u = b(h_1 + h_2 u), k_1 = b h_1, k_2 =
b h_2. We now prove

Theorem 22. The elements 3, 7, 1 + 2u, 1 - 2u are primes of J no two
of which are associated.

For if k is a composite of J we have k = gh, N(k) = N(g)N(h) for ordinary
integral proper divisors N(g) and N(h) of N(k). The norms of the
integers of our theorem are 9, 49, 21, 21, respectively, and the only positive
proper divisors of these norms are 3, 7. But if g = g_1 + g_2 u, we have
N(g) = g_1^2 + 5g_2^2 > 0, g_1^2 + 5g_2^2 = 3, 7. Evidently g_2 ≠ 0, g_2^2 ≤ 1. But
g_2^2 = 1 is impossible since g_1^2 ≠ -2, 2. Thus 3, 7, 1 + 2u, 1 - 2u are primes
of J. The units of J are 1, -1, and clearly no two of these primes are associated.
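
The key step of this proof, that g_1^2 + 5g_2^2 = 3, 7 has no solution in ordinary integers, is a finite search and may be checked directly. The following is a minimal sketch, assuming that we represent k_1 + k_2 u by the pair (k_1, k_2); the names norm5 and elements_of_norm are our own.

```python
from math import isqrt


def norm5(z):
    """N(k1 + k2*u) = k1^2 + 5*k2^2 in the ring with u^2 = -5."""
    a, b = z
    return a * a + 5 * b * b


def elements_of_norm(n):
    """All integers (a, b) of J with norm n; the search box is finite."""
    bound = isqrt(n)
    return [
        (a, b)
        for a in range(-bound, bound + 1)
        for b in range(-bound, bound + 1)
        if norm5((a, b)) == n
    ]


print([norm5(z) for z in [(3, 0), (7, 0), (1, 2), (1, -2)]])
print(elements_of_norm(3), elements_of_norm(7), elements_of_norm(1))
```

The empty lists for the norms 3 and 7 show that none of the four integers has a proper divisor, and the list for the norm 1 recovers the units 1 and -1.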

We see now that 21 = 3 · 7 = (1 + 2u)(1 - 2u), and we have factored
21 into prime factors in J in two distinct ways. Moreover, the principal
ideal {3} of J defined for the prime 3 is not a prime ideal. For 3 divides
neither 1 + 2u nor 1 - 2u in J yet does divide their product, and therefore
the residue class ring J - {3} contains 1 + 2u and 1 - 2u as divisors of
zero.

The ring J contains nonprincipal ideals one of which we shall exhibit.
We let M be the ideal of J consisting of all elements of J of the form

(58) 3x + y(1 + 2u) (x, y in J).

If this ideal were a principal ideal {d}, there would exist a common divisor
d of 3 and 1 + 2u. Since these are nonassociated primes, d must be a unit
1 or -1 of J, and {d} = {1}. Then 1 = 3b + (1 + 2u)c for b and c in J,
7 = 21b + (1 + 2u)7c = (1 + 2u)[b(1 - 2u) + 7c]. But 1 + 2u does not
divide 7, a contradiction. We now proceed to prove that M is a divisorless
ideal of J and hence is a prime ideal of J.

* For further results see L. E. Dickson's Introduction to the Theory of Numbers, pp.
40-42.

We have shown that M is not a principal ideal and does not contain 1.
Since M contains 3 but not 1 it cannot contain 2. For otherwise 3 - 2 = 1
would be in M. Let c be an integer of E in M so that c = 3q + r where
r = 0, 1, 2. Since M contains 3q it contains c - 3q = r, r = 0. Hence
the ordinary integers in M are divisible by 3.

The ideal M contains 3 and 1 + 2u and so contains

(59) w_1 = 3, w_2 = 3u - (1 + 2u) = u - 1.

If h is in M, then h = h_0 + h_2 u for h_0 and h_2 in E, h - h_2 w_2 = h_0 + h_2 in
E and in M. By the proof above h_0 + h_2 = 3h_1 for h_1 in E,

(60) h = h_1 w_1 + h_2 w_2.

We have thus proved that M consists of all quantities of the form (60) for
h_1 and h_2 ordinary integers. Thus we may call w_1, w_2 a basis of M over E.
Note that {1} has a basis 1, u over E in this sense. It may be shown that
every ideal of J has a basis of two elements over E.

Every integer of J has the form k = k_1 + k_2 u = k_1 + k_2 w_2 + k_2. Write
k_1 + k_2 = 3c + r for c in E and r = 0, 1, 2. Then k = c w_1 + k_2 w_2 + r ≡
r (M), the elements of J - M are the residue classes 0, 1, 2 such that 3 = 0.
We have proved that J - M is equivalent to E - {3} and is a field, M is
a divisorless ideal of J.
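
The reduction of an integer of J modulo M is explicit enough to compute. The following is a minimal sketch, assuming that we represent k_1 + k_2 u by the pair (k_1, k_2); the names mul5, decompose, and residue are hypothetical helpers of our own. Here decompose returns the integers c, k_2, r of the paragraph above, so that k = c w_1 + k_2 w_2 + r with w_1 = 3 and w_2 = u - 1, and k lies in M exactly when r = 0.

```python
def mul5(z, w):
    """Product in J, where u^2 = -5."""
    a, b = z
    c, d = w
    return (a * c - 5 * b * d, a * d + b * c)


def decompose(k):
    """Return (c, k2, r) with k = c*w1 + k2*w2 + r and r in {0, 1, 2}."""
    k1, k2 = k
    c, r = divmod(k1 + k2, 3)
    return c, k2, r


def residue(k):
    """The residue class of k modulo M, as one of 0, 1, 2."""
    return decompose(k)[2]


print(decompose((3, 0)), decompose((1, 2)), decompose((7, 0)))

# The residue map respects products, as it must if J - M is a ring.
sample = [(a, b) for a in range(-4, 5) for b in range(-4, 5)]
print(all(residue(mul5(k, l)) == (residue(k) * residue(l)) % 3
          for k in sample for l in sample))
```

Both 3 and 1 + 2u have residue 0, while 7 has residue 1, which agrees with the statement that the ordinary integers in M are those divisible by 3.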

This completes our discussion of ideals and of quadratic fields. We shall 
conclude our text with the following brief bibliographical summary. 

Let us begin with references to standard topics on matrices not covered
in our introductory exposition. The theory of orthogonal and unitary
equivalence of symmetric matrices is contained with generalizations in Chapter V
of the Modern Higher Algebra and is further generalized and connected with
the theory of equivalence of pairs of matrices in the author's paper entitled
"Symmetric and Alternate Matrices in an Arbitrary Field," in Transactions
of the American Mathematical Society, XLIII (1938), 386-436. See also pages
74-76 and Chapter VI of L. E. Dickson's Modern Algebraic Theories, and
Chapter VI of J. M. H. Wedderburn's Lectures on Matrices. Both of these
texts as well as Chapters III, IV, and V of the Modern Higher Algebra include,
of course, all the material of our Chapters II-V. For a discussion
without details but with complete references of many other topics on matrices
see C. C. MacDuffee, The Theory of Matrices ("Ergebnisse der Mathematik
und ihrer Grenzgebiete," Vol. II, No. 5 [pp. 110]).

The theory of rings as given here is contained in the detailed discussion
of Chapters I and II of the Modern Higher Algebra, and the theory of ideals
in Chapter XI. See also the much more extensive treatment in Van der
Waerden's Moderne Algebra. The units of the ring of integers of a quadratic
field are discussed on page 233 of R. Fricke's Algebra, Volume III, and its
ideals on pages 106-10 of E. Hecke's Theorie der algebraischen Zahlen. We
close with a reference to the only recent book in English on algebraic number
theory, H. Weyl's Algebraic Theory of Numbers (Princeton, 1940).
of a matrix, 20
space, 72
Commutative: 
group, 110 
matrices, 37 
ring, 113 
Complementary: 
minor, 27 
spaces, 77 
submatrix, 21 
Complex conjugate, 107


Complex numbers, 75 
field of, 56 
of Gauss, 127 
Composite quantity, 115 
Congruent: 
matrices, 51 
quantities, 116 
Conjunctive matrices, 108
Constant, 1, 19 
Coordinate, 66 
Correspondence, 73 

Definite symmetric matrix, 64 
Degree of: 

a polynomial, 2, 7, 97 
a term, 6 

zero polynomial, 2 
Dependent vectors, 67 
Determinant, 26 
order of, 27 
of a product, 44 
t-rowed, 27

Diag {Ai,..., Afc}, 31 
Diagonal: 
blocks, 31 
elements, 23 
matrix, 29 
of a matrix, 23 
Difference, 57 
Difference ring, 117 
Distributive law, 60 
Divisibility, 115 
Division, right, left, 98 
Division algorithm for: 

Gauss numbers, 127 
integers, 117 
matric polynomials, 97 
polynomials, 4 
Division ring, 113 
Divisors of zero, 113 

Element: 
identity, 110 
inverse, 110 
unity, 113 
zero, 112, 113 
Elementary: 
divisors, 94 
matrix, 92 

Elementary transformations, 24, 25 
cogredient, 62 
matrices of, 43 
Equations, systems of, 19, 79 
Equivalence, 73, 74 
Equivalence of:
bilinear forms, 49 
forms, 17 
groups, 110 
linear spaces, 74 
matrices, 47, 89, 92 
quadratic forms, 69 
Euclid’s process, 10 
Euler’s theorem, 120 

Factor theorem, 5, 99 
Fermat theorem, 120 
Field, 56, 114 

characteristic of, 114 
of complex numbers, 56 
degree of, 121 
generating quantity of, 121 
Forms, 7, 13; see also Bilinear forms 
Function, 73 

characteristic, 101 
minimum, 101 

Fundamental Theorem of Arithmetic,

Galois group, 122 
Gauss integers, 127 
Gauss numbers, 127 
General linear space, 74 
Greatest common divisor, 9 
of Gauss numbers, 128 
of integers, 118 
of polynomials, 9 
process, 10 
Group, 109 
abelian, 110 
additive, 111
commutative, 110
cyclic, 111
order of a, 110 


order of a subgroup of, 110 
subgroup of, 110 
zero element of, 112 
Groups, equivalence of, 110 

Homogeneous polynomial, 7 

Ideal, 116 

divisorless, 117 
prime, 117 
principal, 116 
unit, 116
zero, 116 
Identity: 

element, 110 
matrix, 30 

Independent vectors, 67 
Integers: 

algebraic, 124 
congruent, 120 
ordinary, 117 
Integral domains, 114 
Integral operations, 1 
Invariant factors, 91 
Inverse: 

element, 110 
mapping, 46 
matrix, 45 

Irreducible quantities, 115 

Leading coefficient, 3, 97 
virtual, 3, 97 
Leading form, 8 
Left division, 98 

Length preserving transformation, 86 
Linear: 

combination, 15 
dependence, 67 
form, 15 

independence, 67 
space, 66, 74 
subspace, 66, 71 
Linear change of variables, 38 
Linear mappings, 17, 82 
cogredient, 51 
inverse of, 17, 46 
matrix of, 36, 83 
nonsingular, 17 
product of, 36 
Linear space, 66, 71, 74
basis of, 67 
order of, 71 

Linear transformations, 84 
orthogonal, 86 

Mappings; see Linear mappings 

Matrices: 

addition of, 59 
commutative, 37 
congruent, 51 
congruent pairs of, 108 
conjunctive, 108 
equivalent, 47, 89, 92 
equivalent pairs of, 108 
orthogonal, 86 
orthogonally equivalent, 108 
products of, 36 

rationally equivalent, 25, 34, 47 
similar, 85, 103 
unitary equivalent, 108 

Matrix: 

adjoint of, 29 

augmented, 80 

of a bilinear form, 49 

characteristic, 101 

characteristic function of a, 101 

of coefficients, 19 

columns of, 20 

determinant of, 27 

diagonal, 29 

diagonal of, 23 

diagonal elements of, 23 

elementary, 92 

of an elementary transformation, 43

elements of, 20 

equal, 21 

Hermitian, 108 

identity, 30 

invariant factors of, 91 

inverse of, 45 

of a mapping, 36, 83 

minimum function of, 101 

nonsingular, 34 

partitioning of, 21 

principal diagonal of, 23 

rank of, 32 

rectangular, 19 


rows of, 20 
scalar, 30 
skew, 52 

skew-Hermitian, 108 
square, 20 

symmetric, 52, 53, 59, 64 
transpose of, 22 
triangular, 29 
unitary, 108 
zero, 30 

Minimum function, 101 
Minors, 27 

Monic polynomial, 3, 100 
Multiple roots, 5 

n-ary q-ic form, 13
n-dimensional space, 66 
Nonsingular: 

linear transformation, 84 
mapping, 84 
matrix, 34 
quadratic form, 55 
Norm in a field, 123 
Norm of a vector, 86 

Order of: 

a group, 110 
a group element, 111
a linear space, 71 
Orthogonal: 
matrices, 86 
subspaces, 87 
transformations, 86 
vectors, 87 

Point, 66

Polynomials: 
associated, 5

coefficients of, 6 
constant, 1 
degree of, 2, 7, 97 
divisibility of, 5 
equal, 2 
factors of, 5 
g.c.d. of, 9 
homogeneous, 7 
identically equal, 6 
monic, 3, 100 
n-ic, 13 
products of, 3 
q-ary, 13

rationally irreducible, 12 
relatively prime, 11 
roots of, 5 
scalar, 100 

in several variables, 6 
virtual degree of, 2, 6, 97 
zero, 1 

Positive form, 64 
Prime to a polynomial, 11
Prime quantity, 115 
Primitive root, 121 
Principal diagonal, 23 

Quadratic field, 121 
integers of, 124 
Quadratic form, 54 
definite, 64 
index of, 62 
negative, 64 
nonsingular, 55 
real, 62 

Quadratic forms, equivalent, 59 
Quantities: 
associated, 115 
composite, 115 
congruent, 116
irreducible, 115 
prime, 115 
relatively prime, 115 
Quartic field, 121 
Quartic form, 13 
Quasi-field, 57 
Quaternary form, 13 
Quaternions, 114 
Quinary form, 13 
Quotient, 57 
right, left, 98 

Rank of: 

an adjoint matrix, 46 
a bilinear form, 49 
a matrix, 32 
a product, 44 
a quadratic form, 54 
a row space, 72 


Rational: 

equivalence, 25 
functions, 8 
operations, 8 
Real: 

quadratic form, 62 
quaternions, 114 
symmetric matrix, 64 
Reflection, 87 
Relatively prime: 
polynomials, 11 
quantities, 115 
Remainder: 
right, left, 98 
theorem, 4, 98 
Residue classes, 116 
Right division, 98 
Ring, 112 

commutative, 113 
difference, 117 
division, 113 
residue class, 117 
Roots, 5 
multiple, 5 
Rotation of axes, 87 
Row, 69 
rank, 72 
space, 69, 72 
Row by column rule, 37 

Scalar: 
matrix, 30 
polynomial, 100 
product, 16, 41 
Scalars, 19 
Self-equivalence, 107 
Semidefinite forms, matrices, 64 
Sequence: 

elements of, 16 
zero, 16 

Similar matrices, 85, 103 
Skew forms, 53 
Skew matrices, 52 
Space; see Linear space 
Spanning a space, 67 
Subfield, subgroup, 110 
Submatrix, 21 

complementary, 21 
Subring, 113
Subspaces, 66, 75, 77 
complementary, 77 
sum of, 77 
Subsystems, 110 
Subtraction, 112 
Symmetric bilinear forms, 54 
Symmetric matrices, 52, 53 
congruence of, 59 
definiteness of, 64 

Ternary forms, 13 
Trace, 123 
Transformations, 84 
Transpose, 22 
of a product, 37 
of a sum, 62 

Unary form, 13 

Unique factorization, 115, 118, 127 


Unit ideal, 116 
Units, 115 
Unity element, 113 

Variables, change of, 38 
Vectors, 66 

linearly independent, 67 
norm of, 86 
orthogonal, 87 
zero, 66 

Virtual degree, 2, 6, 97 
Virtual leading coefficient, 3, 97 

Zero: 

divisors of, 113 
element, 112, 113 
ideal, 116 
matrix, 30 
polynomial, 1 
sequence, 16 
space, 67 